C Program For Heap Sort Using Recursion
Sorting data efficiently is a fundamental task in computer science, crucial for optimizing search operations, database management, and more. Among various sorting algorithms, heap sort stands out for its efficiency, offering a worst-case time complexity of O(n log n). In this article, you will learn how to implement heap sort using recursion in C, leveraging the power of binary heaps.
Problem Statement
The challenge is to arrange a given list of unsorted elements into ascending (or descending) order using the heap sort algorithm. This method relies on the heap data structure, a specialized tree-based structure that satisfies the heap property. Specifically, for ascending order, we will build a max-heap, where the value of each parent node is greater than or equal to the values of its children.
Example
Consider the following array of integers: [12, 11, 13, 5, 6, 7]
After applying heap sort, the array will be sorted in ascending order: [5, 6, 7, 11, 12, 13]
Background & Knowledge Prerequisites
To understand heap sort, readers should be familiar with:
- Arrays: Basic data structures for storing collections of elements.
- Binary Trees: Tree data structures where each node has at most two children.
- Heaps: A complete binary tree, usually stored in an array, that satisfies the heap property (max-heap for sorting in ascending order, min-heap for descending).
- Recursion: A programming technique where a function calls itself to solve a smaller instance of the same problem.
- Basic C Syntax: Variables, loops, functions, and array manipulation in C.
For implementation, you will need a C compiler (like GCC) and a text editor.
Use Cases or Case Studies
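Heap sort, and the binary heap it is built on, appear in a range of scenarios because the algorithm sorts in place and has efficient worst-case performance. Heap sort is not a stable sort, so equal elements may not keep their original relative order.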
- System Software: Operating systems may manage process priorities with a heap when scheduling.
- Graph Algorithms: The underlying heap often serves as the priority queue in algorithms like Dijkstra's or Prim's, especially when dealing with large datasets.
- External Sorting: While not its primary use, the heap structure can be adapted for external sorting of data that doesn't fit into memory.
- Sorting Large Datasets: Its O(n log n) worst-case time complexity makes it suitable for sorting large arrays where predictable performance is critical.
- "Top K" Problems: Efficiently finding the K largest or smallest elements in a dataset without fully sorting the entire set.
Solution Approaches
The core of heap sort involves two main steps: building a max-heap from the input array and then repeatedly extracting the maximum element from the heap. Both steps leverage a recursive helper function called heapify.
Recursive Heap Sort
The heapify function restores the max-heap property for a single subtree: it moves the subtree's root down one level if a child is larger, then calls itself on the child subtree that the swap disturbed.
The process for sorting an array arr of size n is as follows:
- Build a Max-Heap: Transform the input array into a max-heap. This is done by starting from the last non-leaf node and calling heapify on all nodes up to the root.
- Extract Elements: Repeatedly extract the maximum element (which is always at the root of the max-heap) and place it at the end of the array. After extraction, the heap size is reduced, and the new root is heapified to restore the max-heap property.
Code Example
// Heap Sort using Recursion
#include <stdio.h>
// Function to maintain the heap property in a subtree rooted at index i.
// n is the size of the heap.
void heapify(int arr[], int n, int i) {
    // Step 1: Initialize 'largest' as the root of the current subtree
    int largest = i;
    // Calculate indices of left and right children
    int left = 2 * i + 1;  // Left child index
    int right = 2 * i + 2; // Right child index
    // Step 2: Check if the left child exists and is greater than the current largest
    if (left < n && arr[left] > arr[largest]) {
        largest = left;
    }
    // Step 3: Check if the right child exists and is greater than the current largest
    if (right < n && arr[right] > arr[largest]) {
        largest = right;
    }
    // Step 4: If the largest element is not the current root, swap them
    // and recursively heapify the affected subtree.
    if (largest != i) {
        int temp = arr[i];
        arr[i] = arr[largest];
        arr[largest] = temp;
        // Recursively call heapify on the subtree that was affected by the swap
        heapify(arr, n, largest);
    }
}
// Main function to perform heap sort on an array of size n
void heapSort(int arr[], int n) {
    // Step 1: Build a max-heap (rearrange the array)
    // We start from the last non-leaf node (n/2 - 1) and go up to the root (0).
    // This ensures that when heapify is called on a node, its children are already heaps.
    for (int i = n / 2 - 1; i >= 0; i--) {
        heapify(arr, n, i);
    }
    // Step 2: Extract elements one by one from the heap
    // After building the heap, the largest element is at arr[0].
    // We swap it with the last element, reduce the heap size, and re-heapify.
    for (int i = n - 1; i > 0; i--) {
        // Move the current root (largest element) to the end of the array
        int temp = arr[0];
        arr[0] = arr[i];
        arr[i] = temp;
        // Call heapify on the reduced heap (excluding the last element which is now sorted)
        // to maintain the max-heap property for the remaining elements.
        heapify(arr, i, 0);
    }
}
// Utility function to print an array
void printArray(int arr[], int n) {
    for (int i = 0; i < n; ++i) {
        printf("%d ", arr[i]);
    }
    printf("\n");
}
int main() {
    // Step 1: Define an example array to be sorted
    int arr[] = {12, 11, 13, 5, 6, 7};
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Original array: ");
    printArray(arr, n);
    // Step 2: Perform heap sort using the defined function
    heapSort(arr, n);
    printf("Sorted array: ");
    printArray(arr, n);
    return 0;
}
Sample Output
Original array: 12 11 13 5 6 7
Sorted array: 5 6 7 11 12 13
Stepwise Explanation
heapify(arr[], n, i) Function:
- This function is responsible for ensuring the subtree rooted at index i (and its children) satisfies the max-heap property.
- It initializes largest to i, assuming the root is the largest.
- It calculates the indices of the left (2 * i + 1) and right (2 * i + 2) children.
- It compares the element at i with its left and right children. If a child is larger, largest is updated to that child's index.
- If largest is no longer i (meaning a child was larger), the element at arr[i] is swapped with arr[largest].
- Crucially, heapify is then called recursively on the subtree where the swap occurred (heapify(arr, n, largest)) to ensure that portion of the heap also remains a max-heap after the change.
heapSort(arr[], n) Function:
- Building the Max-Heap: The first for loop (for (int i = n / 2 - 1; i >= 0; i--)) iterates from the last non-leaf node (n/2 - 1) up to the root (0). For each node, heapify is called. This process effectively converts the entire array into a max-heap. When heapify is called for a node, its children (if any) are already valid heaps due to the bottom-up approach.
- Extracting Elements: The second for loop (for (int i = n - 1; i > 0; i--)) extracts elements one by one.
  - In a max-heap, the largest element is always at arr[0].
  - This largest element is swapped with the last element of the current heap (arr[i]).
  - The heap size is then logically reduced by one (by passing i as the n parameter of the heapify call), effectively placing the largest element in its sorted position at the end of the array.
  - heapify(arr, i, 0) is called to restore the max-heap property for the remaining i elements, with the new (potentially smaller) element at the root. This process repeats until the entire array is sorted.
main() Function:
- An example integer array arr is initialized.
- The size n of the array is calculated.
- The original array is printed.
- heapSort is called to sort the array.
- The sorted array is printed.
Conclusion
Heap sort is a robust and efficient comparison-based sorting algorithm, particularly valuable for its O(n log n) worst-case time complexity, making its performance predictable even with highly disordered inputs. By leveraging the max-heap data structure and a recursive heapify function, we can systematically build a heap and extract elements to achieve a sorted array. The recursive nature of heapify simplifies the logic for maintaining the heap property after element manipulations.
Summary
- Heap sort is an efficient, comparison-based sorting algorithm with O(n log n) time complexity.
- It operates by building a max-heap from the input array.
- The heapify function recursively ensures that a subtree satisfies the max-heap property.
- The main heapSort function first builds the heap bottom-up, then repeatedly extracts the largest element (root) and re-heapifies the remaining elements.
- Heap sort is useful in scenarios requiring predictable performance, such as system scheduling and large dataset sorting.