Heapsort

All the three algorithms we have used so far didn't require any specific data structure but heapsort does. It is implemented using heaps. As stated earlier in this course, some of the algorithms we will study in this course will be implemented using some data structures. So, it is recommended to read articles on data structure first. However, if you are familiar with data structures, then you are good to go.

Heaps Recap

• Heaps are complete binary trees implemented using an array
• Parent of a node i is $\lfloor\frac{i}{2}\rfloor$
• Left child of the node i is $2*i$
• Right child of the node i is $(2*i)+1$
• Max-heap property → The value of a node is greater than or equal to the value of its children i.e., $A[Parent[i]] \ge A[i]$ for all nodes i>1.
• Min-heap property → The value of a node is either smaller or equal to the value of its children i.e., $A[Parent[i]] \le A[i]$ for all nodes i>1.
• HEAPIFY is a function used to maintain the heap property of any heap. It is applied to a node when the children of the node are following the property of a heap but the node itself may be violating it. It runs in $O(\lg{n})$ time.
• BUILD-HEAP is a function used to make a heap from an array. It runs in $O(n)$ time.

Working of Heapsort

Heapsort is implemented using a max-heap. In a max-heap, the maximum element is at the root of the tree and is also the first element of the array i.e., A. Now, we want this maximum element to be at the last element of our sorted array. So, we swap it with the last element and then discard this element from the heap by reducing the size of the heap. So, we have correctly put the largest element at the correct position, but in doing so, we have disturbed our heap or more precisely, the root of the heap because we haven't touched any child of the root yet. Thus they are still following the max-heap property.  We can easily deal with this problem by calling HEAPIFY on the root of the tree and that will make our tree a max-heap again, but this time the largest element won't be the part of the heap. Now, we will just repeat this process until every element of the heap is correctly placed in sorted order.    Now you have understood the concepts behind the heapsort, so let's write a code for the heapsort.

Coding Heapsort

We can easily implement the above logic in a code. We have to swap the first element of the array with the last element of the heap till the heap exists or the size of the heap is not 0.

while A.heap_size > 0
swap(A, A[heap_size])

After this, we just have to decrease the size of the heap by 1 to exclude the last element (largest element) form the heap.

A.heap_size = A.heap_size-1

The last part is to make the array a max-heap again by calling MAX-HEAPIFY function on the root because only the root is disturbed in the entire process.

MAX-HEAPIFIY(A, 1)

Take note that we are using MAX-HEAPIFY function of a heap which is explained in the data structure course.

Summing up the above code blocks, we can develop the entire code for the heapsort as:

HEAPSORT(A)
while A.heap_size > 0
swap(A, A[A.heap_size])
A.heap_size = A.heap_size-1
MAX-HEPAPIFY(A, 1)
• C
• Python
• Java
#include <stdio.h>

int tree_array_size = 11;
int heap_size = 10;

void swap( int *a, int *b ) {
int t;
t = *a;
*a = *b;
*b = t;
}

//function to get right child of a node of a tree
int get_right_child(int A[], int index) {
if((((2*index)+1) < tree_array_size) && (index >= 1))
return (2*index)+1;
return -1;
}

//function to get left child of a node of a tree
int get_left_child(int A[], int index) {
if(((2*index) < tree_array_size) && (index >= 1))
return 2*index;
return -1;
}

//function to get the parent of a node of a tree
int get_parent(int A[], int index) {
if ((index > 1) && (index < tree_array_size)) {
return index/2;
}
return -1;
}

void max_heapify(int A[], int index) {
int left_child_index = get_left_child(A, index);
int right_child_index = get_right_child(A, index);

// finding largest among index, left child and right child
int largest = index;

if ((left_child_index <= heap_size) && (left_child_index>0)) {
if (A[left_child_index] > A[largest]) {
largest = left_child_index;
}
}

if ((right_child_index <= heap_size && (right_child_index>0))) {
if (A[right_child_index] > A[largest]) {
largest = right_child_index;
}
}

// largest is not the node, node is not a heap
if (largest != index) {
swap(&A[index], &A[largest]);
max_heapify(A, largest);
}
}

void build_max_heap(int A[]) {
int i;
for(i=heap_size/2; i>=1; i--) {
max_heapify(A, i);
}
}

void heapsort(int a[]) {
while(heap_size > 0) {
swap(&a, &a[heap_size]);
heap_size = heap_size-1;
max_heapify(a, 1);
}
}

int main() {
//tree is starting from index 1 and not 0
int A[] = {0, 15, 20, 7, 9, 5, 8, 6, 10, 2, 1};
build_max_heap(A);
heapsort(A);

//printing
int i;
for(i=1; i<=10; i++) {
printf("%d\n",A[i]);
}
return 0;
}
heap_size = 10

# function to get right child of a node of a tree
def get_right_child(A, index):
if((((2*index)+1) < len(A)) and (index >= 1)):
return (2*index)+1
return -1

# function to get left child of a node of a tree
def get_left_child(A, index):
if(((2*index) < len(A)) and (index >= 1)):
return 2*index
return -1

# function to get the parent of a node of a tree
def get_parent(A, index):
if ((index > 1) and (index < len(A))):
return index//2
return -1

def max_heapify(A, index):
left_child_index = get_left_child(A, index)
right_child_index = get_right_child(A, index)

# finding largest among index, left child and right child
largest = index

if ((left_child_index <= heap_size) and (left_child_index>0)):
if (A[left_child_index] > A[largest]):
largest = left_child_index

if ((right_child_index <= heap_size and (right_child_index>0))):
if (A[right_child_index] > A[largest]):
largest = right_child_index

# largest is not the node, node is not a heap
if (largest != index):
A[index], A[largest] = A[largest], A[index]
max_heapify(A, largest)

def build_max_heap(A):
for i in range(heap_size//2, 0, -1):
max_heapify(A, i)

def heapsort(a):
global heap_size
while(heap_size > 0):
a, a[heap_size] = a[heap_size], a
heap_size = heap_size-1
max_heapify(a, 1)

if __name__ == '__main__':
# tree is starting from index 1 and not 0
A = [0, 15, 20, 7, 9, 5, 8, 6, 10, 2, 1]
build_max_heap(A)
heapsort(A)

print(A[1:11])
class Heap {
public static int heapSize = 10;

//function to get right child of a node of a tree
public static int getRightChild(int A[], int index) {
if((((2*index)+1) < A.length && (index >= 1)))
return (2*index)+1;
return -1;
}

//function to get left child of a node of a tree
public static int getLeftChild(int A[], int index) {
if(((2*index) < A.length && (index >= 1)))
return 2*index;
return -1;
}

//function to get the parent of a node of a tree
public static int getParent(int A[], int index) {
if ((index > 1) && (index < A.length)) {
return index/2;
}
return -1;
}

public static void maxHeapify(int A[], int index) {
int leftChildIndex = getLeftChild(A, index);
int rightChildIndex = getRightChild(A, index);

// finding largest among index, left child and right child
int largest = index;

if ((leftChildIndex <= heapSize) && (leftChildIndex>0)) {
if (A[leftChildIndex] > A[largest]) {
largest = leftChildIndex;
}
}

if ((rightChildIndex <= heapSize && (rightChildIndex>0))) {
if (A[rightChildIndex] > A[largest]) {
largest = rightChildIndex;
}
}

// largest is not the node, node is not a heap
if (largest != index) {
int temp;
//swapping
temp = A[largest];
A[largest] = A[index];
A[index] = temp;
maxHeapify(A, largest);
}
}

public static void buildMaxHeap(int A[]) {
int i;
for(i=heapSize/2; i>=1; i--) {
maxHeapify(A, i);
}
}

public static void heapSort(int a[]) {
while (heapSize > 0) {
int temp;
temp = a;
a = a[heapSize];
a[heapSize] = temp;
heapSize--;
maxHeapify(a, 1);
}
}

public static void main(String[] args) {
//tree is starting from index 1 and not 0
int A[] = {0, 15, 20, 7, 9, 5, 8, 6, 10, 2, 1};
buildMaxHeap(A);
heapSort(A);
for(int i=1; i<=10; i++) {
System.out.println(A[i]);
}
}
}

Analysis of Heapsort

The analysis of the code is simple. There is a while loop which is running n times and each time it is executing 3 statements. The first two statements (swap(A, A[A.heap_size]) and A.heap_size = A.heap_size-1) will take a constant time but the last statement i.e., MAX-HEPAPIFY(A, 1) is going to take $O(\lg{n})$ time. So, the algorithm is going to take a total of $O(n\lg{n})$ time.

One can also argue that the analysis should also include the time of building a heap, assuming that the array we are provided is not a max-heap. Then the heapsort algorithm will first make a call to BUILD-MAX-HEAP(A) to make a max-heap and then perform the rest of the tasks. In that case also, the heapsort will take $O(n\lg{n})$ time because BUILD-MAX-HEAP will take $O(n)$ time which will be dominated by $O(n\lg{n})$.

So, that's it for the sorting algorithms. You can always check the further readings and BlogsDope sorting tag to read more about sorting. If you don't want to miss any new article from us, consider checking our BlogsDope app.

Tell me and I forget. Teach me and I remember. Involve me and I learn.
- Benjamin Franklin  