Close
Close

Heapsort


All the three algorithms we have used so far didn't require any specific data structure but heapsort does. It is implemented using heaps. As stated earlier in this course, some of the algorithms we will study in this course will be implemented using some data structures. So, it is recommended to read articles on data structure first. However, if you are familiar with data structures, then you are good to go.

Heaps Recap


  • Heaps are complete binary trees implemented using an array
    • Parent of a node i is $\lfloor\frac{i}{2}\rfloor$
    • Left child of the node i is $2*i$
    • Right child of the node i is $(2*i)+1$
    • Max-heap property → The value of a node is greater than or equal to the value of its children i.e., $A[Parent[i]] \ge A[i]$ for all nodes i>1.
    • Min-heap property → The value of a node is either smaller or equal to the value of its children i.e., $A[Parent[i]] \le A[i]$ for all nodes i>1.
  • HEAPIFY is a function used to maintain the heap property of any heap. It is applied to a node when the children of the node are following the property of a heap but the node itself may be violating it. It runs in $O(\lg{n})$ time.
  • BUILD-HEAP is a function used to make a heap from an array. It runs in $O(n)$ time.

Let's start with heapsort

Working of Heapsort


Heapsort is implemented using a max-heap. In a max-heap, the maximum element is at the root of the tree and is also the first element of the array i.e., A[1].

maximum heap

Now, we want this maximum element to be at the last element of our sorted array. So, we swap it with the last element and then discard this element from the heap by reducing the size of the heap.

swapping maximum element with last and removing it

So, we have correctly put the largest element at the correct position, but in doing so, we have disturbed our heap or more precisely, the root of the heap because we haven't touched any child of the root yet. Thus they are still following the max-heap property.

calling heapify on the root of tree

heapify animation

We can easily deal with this problem by calling HEAPIFY on the root of the tree and that will make our tree a max-heap again, but this time the largest element won't be the part of the heap.

making maximum heap

Now, we will just repeat this process until every element of the heap is correctly placed in sorted order.

heapsort

steps of heapsort

steps of heapsort

heapsort animation

Now you have understood the concepts behind the heapsort, so let's write a code for the heapsort.

Coding Heapsort


We can easily implement the above logic in a code. We have to swap the first element of the array with the last element of the heap till the heap exists or the size of the heap is not 0.

while A.heap_size > 0
  swap(A[1], A[heap_size])

After this, we just have to decrease the size of the heap by 1 to exclude the last element (largest element) form the heap.

A.heap_size = A.heap_size-1

The last part is to make the array a max-heap again by calling MAX-HEAPIFY function on the root because only the root is disturbed in the entire process.

MAX-HEAPIFIY(A, 1)

Take note that we are using MAX-HEAPIFY function of a heap which is explained in the data structure course.

Summing up the above code blocks, we can develop the entire code for the heapsort as:

HEAPSORT(A)
  while A.heap_size > 0
    swap(A[1], A[A.heap_size])
    A.heap_size = A.heap_size-1
    MAX-HEPAPIFY(A, 1)
  • C
  • Python
  • Java
#include <stdio.h>

int tree_array_size = 11;
int heap_size = 10;

void swap( int *a, int *b ) {
  int t;
  t = *a;
  *a = *b;
  *b = t;
}

//function to get right child of a node of a tree
int get_right_child(int A[], int index) {
  if((((2*index)+1) < tree_array_size) && (index >= 1))
    return (2*index)+1;
  return -1;
}

//function to get left child of a node of a tree
int get_left_child(int A[], int index) {
    if(((2*index) < tree_array_size) && (index >= 1))
        return 2*index;
    return -1;
}

//function to get the parent of a node of a tree
int get_parent(int A[], int index) {
  if ((index > 1) && (index < tree_array_size)) {
    return index/2;
  }
  return -1;
}

void max_heapify(int A[], int index) {
  int left_child_index = get_left_child(A, index);
  int right_child_index = get_right_child(A, index);

  // finding largest among index, left child and right child
  int largest = index;

  if ((left_child_index <= heap_size) && (left_child_index>0)) {
    if (A[left_child_index] > A[largest]) {
      largest = left_child_index;
    }
  }

  if ((right_child_index <= heap_size && (right_child_index>0))) {
    if (A[right_child_index] > A[largest]) {
      largest = right_child_index;
    }
  }

  // largest is not the node, node is not a heap
  if (largest != index) {
    swap(&A[index], &A[largest]);
    max_heapify(A, largest);
  }
}

void build_max_heap(int A[]) {
  int i;
  for(i=heap_size/2; i>=1; i--) {
    max_heapify(A, i);
  }
}

void heapsort(int a[]) {
  while(heap_size > 0) {
    swap(&a[1], &a[heap_size]);
    heap_size = heap_size-1;
    max_heapify(a, 1);
  }
}

int main() {
  //tree is starting from index 1 and not 0
  int A[] = {0, 15, 20, 7, 9, 5, 8, 6, 10, 2, 1};
  build_max_heap(A);
  heapsort(A);

  //printing
  int i;
  for(i=1; i<=10; i++) {
    printf("%d\n",A[i]);
  }
  return 0;
}
heap_size = 10

# function to get right child of a node of a tree
def get_right_child(A, index):
  if((((2*index)+1) < len(A)) and (index >= 1)):
    return (2*index)+1
  return -1

# function to get left child of a node of a tree
def get_left_child(A, index):
    if(((2*index) < len(A)) and (index >= 1)):
        return 2*index
    return -1

# function to get the parent of a node of a tree
def get_parent(A, index):
  if ((index > 1) and (index < len(A))):
    return index//2
  return -1

def max_heapify(A, index):
  left_child_index = get_left_child(A, index)
  right_child_index = get_right_child(A, index)

  # finding largest among index, left child and right child
  largest = index

  if ((left_child_index <= heap_size) and (left_child_index>0)):
    if (A[left_child_index] > A[largest]):
      largest = left_child_index

  if ((right_child_index <= heap_size and (right_child_index>0))):
    if (A[right_child_index] > A[largest]):
      largest = right_child_index

  # largest is not the node, node is not a heap
  if (largest != index):
    A[index], A[largest] = A[largest], A[index]
    max_heapify(A, largest)

def build_max_heap(A):
  for i in range(heap_size//2, 0, -1):
    max_heapify(A, i)

def heapsort(a):
  global heap_size
  while(heap_size > 0):
    a[1], a[heap_size] = a[heap_size], a[1]
    heap_size = heap_size-1
    max_heapify(a, 1)

if __name__ == '__main__':
  # tree is starting from index 1 and not 0
  A = [0, 15, 20, 7, 9, 5, 8, 6, 10, 2, 1]
  build_max_heap(A)
  heapsort(A)

  print(A[1:11])
class Heap {
  public static int heapSize = 10;

  //function to get right child of a node of a tree
  public static int getRightChild(int A[], int index) {
    if((((2*index)+1) < A.length && (index >= 1)))
      return (2*index)+1;
    return -1;
  }

  //function to get left child of a node of a tree
  public static int getLeftChild(int A[], int index) {
      if(((2*index) < A.length && (index >= 1)))
          return 2*index;
      return -1;
  }

  //function to get the parent of a node of a tree
  public static int getParent(int A[], int index) {
    if ((index > 1) && (index < A.length)) {
      return index/2;
    }
    return -1;
  }

  public static void maxHeapify(int A[], int index) {
    int leftChildIndex = getLeftChild(A, index);
    int rightChildIndex = getRightChild(A, index);

    // finding largest among index, left child and right child
    int largest = index;

    if ((leftChildIndex <= heapSize) && (leftChildIndex>0)) {
      if (A[leftChildIndex] > A[largest]) {
        largest = leftChildIndex;
      }
    }

    if ((rightChildIndex <= heapSize && (rightChildIndex>0))) {
      if (A[rightChildIndex] > A[largest]) {
        largest = rightChildIndex;
      }
    }

    // largest is not the node, node is not a heap
    if (largest != index) {
      int temp;
      //swapping
      temp = A[largest];
      A[largest] = A[index];
      A[index] = temp;
      maxHeapify(A, largest);
    }
  }

  public static void buildMaxHeap(int A[]) {
    int i;
    for(i=heapSize/2; i>=1; i--) {
      maxHeapify(A, i);
    }
  }

  public static void heapSort(int a[]) {
    while (heapSize > 0) {
      int temp;
      temp = a[1];
      a[1] = a[heapSize];
      a[heapSize] = temp;
      heapSize--;
      maxHeapify(a, 1);
    }
  }

  public static void main(String[] args) {
    //tree is starting from index 1 and not 0
    int A[] = {0, 15, 20, 7, 9, 5, 8, 6, 10, 2, 1};
    buildMaxHeap(A);
    heapSort(A);
    for(int i=1; i<=10; i++) {
      System.out.println(A[i]);
    }
  }
}

					

Analysis of Heapsort


The analysis of the code is simple. There is a while loop which is running n times and each time it is executing 3 statements. The first two statements (swap(A[1], A[A.heap_size]) and A.heap_size = A.heap_size-1) will take a constant time but the last statement i.e., MAX-HEPAPIFY(A, 1) is going to take $O(\lg{n})$ time. So, the algorithm is going to take a total of $O(n\lg{n})$ time.

One can also argue that the analysis should also include the time of building a heap, assuming that the array we are provided is not a max-heap. Then the heapsort algorithm will first make a call to BUILD-MAX-HEAP(A) to make a max-heap and then perform the rest of the tasks. In that case also, the heapsort will take $O(n\lg{n})$ time because BUILD-MAX-HEAP will take $O(n)$ time which will be dominated by $O(n\lg{n})$.

So, that's it for the sorting algorithms. You can always check the further readings and BlogsDope sorting tag to read more about sorting. If you don't want to miss any new article from us, consider checking our BlogsDope app.

Tell me and I forget. Teach me and I remember. Involve me and I learn.
- Benjamin Franklin

Ask Yours
Post Yours
Doubt? Ask question