C Program For Heap Sort In Data Structure
Heap Sort is an efficient, comparison-based sorting algorithm that organizes data using a binary heap data structure. In this article, you will learn how to implement Heap Sort in C, understanding its underlying principles and practical application.
Problem Statement
Efficiently organizing data is a fundamental problem in computer science. Unsorted collections of elements can lead to slow search times and complex data manipulation. For large datasets, choosing a sorting algorithm that offers good performance consistently is crucial for system efficiency and responsiveness.
Example
Consider an unsorted array of integers: [12, 11, 13, 5, 6, 7].
After applying Heap Sort, the array will be sorted in ascending order: [5, 6, 7, 11, 12, 13].
Background & Knowledge Prerequisites
To understand Heap Sort, familiarity with the following concepts is beneficial:
- Arrays: Basic understanding of array data structures and indexing.
- Binary Trees: A conceptual understanding of trees where each node has at most two children.
- Heaps: Specifically, a Max-Heap, which is a complete binary tree where the value of each node is greater than or equal to the values of its children. The largest element is always at the root.
- Basic C Programming: Knowledge of functions, loops, conditional statements, and pointers.
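The Max-Heap definition above can be made concrete with a small checker. This is only an illustrative sketch: isMaxHeap is a hypothetical helper name, and it assumes the usual array layout in which the children of index i sit at 2*i + 1 and 2*i + 2.

```c
#include <assert.h>
#include <stdbool.h>

// isMaxHeap is a hypothetical helper, not part of the article's program.
// Returns true if arr[0..n-1], read as a complete binary tree stored
// level by level, satisfies the Max-Heap property.
bool isMaxHeap(const int arr[], int n) {
    assert(n >= 0);
    // Only nodes with at least one child need checking: indices 0 .. n/2 - 1.
    for (int i = 0; i < n / 2; i++) {
        int left = 2 * i + 1;   // always inside the array when i < n / 2
        int right = 2 * i + 2;  // may fall outside the array
        if (arr[left] > arr[i]) {
            return false;
        }
        if (right < n && arr[right] > arr[i]) {
            return false;
        }
    }
    return true;
}
```

For example, {13, 11, 12, 5, 6, 7} passes the check, while {12, 11, 13, 5, 6, 7} fails because the right child 13 is larger than the root 12.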
Use Cases or Case Studies
Heap Sort is a versatile algorithm with several practical applications:
- Priority Queues: Heaps are the fundamental data structure for implementing priority queues, where elements are retrieved based on their priority (e.g., in operating system process scheduling).
- System Security: Because its worst-case running time is O(n log n), Heap Sort cannot be pushed into quadratic behavior by adversarially chosen input, which matters in code that sorts untrusted data.
- External Sorting: Can be adapted for sorting very large datasets that do not fit into memory, by processing chunks of data.
- Selection Algorithms: Efficiently finding the k-th smallest or largest element in an array.
- Reliability: Unlike Quick Sort, Heap Sort has a guaranteed O(n log n) time complexity in all cases (worst, average, best), making it suitable for systems requiring predictable performance.
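As an illustration of the selection use case, one possible approach is to build a Max-Heap and remove the maximum k - 1 times, after which the root holds the k-th largest value. The sketch below is an assumption about how this could be written, not a reference implementation; kthLargest is a hypothetical name, and the function reorders the input array.

```c
#include <assert.h>

// Same sift-down step as a recursive heapify, repeated here so the
// sketch compiles on its own.
void heapify(int arr[], int n, int i) {
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;
    if (left < n && arr[left] > arr[largest]) largest = left;
    if (right < n && arr[right] > arr[largest]) largest = right;
    if (largest != i) {
        int tmp = arr[i];
        arr[i] = arr[largest];
        arr[largest] = tmp;
        heapify(arr, n, largest);
    }
}

// kthLargest is a hypothetical helper: returns the k-th largest value
// (k = 1 is the maximum). It reorders arr as a side effect.
int kthLargest(int arr[], int n, int k) {
    assert(n > 0 && k >= 1 && k <= n);
    // Build a max-heap over the whole array.
    for (int i = n / 2 - 1; i >= 0; i--) {
        heapify(arr, n, i);
    }
    // Remove the current maximum k - 1 times.
    int heapSize = n;
    for (int removed = 0; removed < k - 1; removed++) {
        int tmp = arr[0];
        arr[0] = arr[heapSize - 1];
        arr[heapSize - 1] = tmp;
        heapSize--;
        heapify(arr, heapSize, 0);
    }
    return arr[0];  // root of the remaining heap
}
```

Building the heap takes O(n) time and each removal takes O(log n), so this runs in O(n + k log n) overall, which beats a full sort when k is small.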
Solution Approaches
Heap Sort involves two main phases: building a Max-Heap from the input array and then repeatedly extracting the maximum element from the heap.
Approach 1: The heapify Function
The heapify function is critical. It ensures that a subtree rooted at a given index i satisfies the Max-Heap property. If the root is smaller than its children, it's swapped with the largest child, and the heapify process continues recursively on the affected subtree.
- One-line summary: Restores the Max-Heap property for a subtree rooted at i.
// Heap Sort in C
#include <stdio.h>

// Function to swap two integers
void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

// Function to heapify a subtree rooted with node i which is
// an index in arr[]. n is size of heap
void heapify(int arr[], int n, int i) {
    int largest = i;       // Initialize largest as root
    int left = 2 * i + 1;  // left child
    int right = 2 * i + 2; // right child

    // If left child is larger than root
    if (left < n && arr[left] > arr[largest]) {
        largest = left;
    }
    // If right child is larger than largest so far
    if (right < n && arr[right] > arr[largest]) {
        largest = right;
    }
    // If largest is not root
    if (largest != i) {
        swap(&arr[i], &arr[largest]);
        // Recursively heapify the affected sub-tree
        heapify(arr, n, largest);
    }
}

// Main function to sort an array of given size
void heapSort(int arr[], int n) {
    // Step 1: Build a max-heap (rearrange array)
    // Start from the last non-leaf node and go up to the root
    for (int i = n / 2 - 1; i >= 0; i--) {
        heapify(arr, n, i);
    }
    // Step 2: One by one extract an element from heap
    for (int i = n - 1; i > 0; i--) {
        // Move current root to end
        swap(&arr[0], &arr[i]);
        // Call heapify on the reduced heap
        heapify(arr, i, 0);
    }
}

// Function to print an array
void printArray(int arr[], int n) {
    for (int i = 0; i < n; ++i) {
        printf("%d ", arr[i]);
    }
    printf("\n");
}

int main(void) {
    // Step 1: Initialize an array
    int arr[] = {12, 11, 13, 5, 6, 7};
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Original array: ");
    printArray(arr, n);
    // Step 2: Perform heap sort
    heapSort(arr, n);
    printf("Sorted array: ");
    printArray(arr, n);
    return 0;
}
- Sample output:
Original array: 12 11 13 5 6 7
Sorted array: 5 6 7 11 12 13
- Stepwise explanation for clarity:
swap function: A simple utility to exchange values between two integer pointers.
heapify(arr, n, i):
- It first assumes the current node i is the largest.
- It then calculates the indices of its left (2*i + 1) and right (2*i + 2) children.
- It compares arr[i] with its children arr[left] and arr[right] to find the true largest element among them.
- If the largest element is not i itself (meaning a child was larger), it swaps arr[i] with arr[largest].
- Crucially, it then recursively calls heapify on the subtree rooted at largest (where the original arr[i] value has moved) to ensure that subtree also maintains the Max-Heap property.
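The recursive call in the last step is a tail call, so the same logic can also be written as a loop. The following is a minimal sketch of that alternative, assuming the same array layout; heapifyIterative is a hypothetical name and is not part of the program above.

```c
#include <assert.h>

// heapifyIterative is a hypothetical loop-based variant of heapify.
// It sifts arr[i] down until the subtree rooted at i is a max-heap,
// where n is the current heap size.
void heapifyIterative(int arr[], int n, int i) {
    assert(i >= 0);
    for (;;) {
        int largest = i;
        int left = 2 * i + 1;
        int right = 2 * i + 2;

        if (left < n && arr[left] > arr[largest]) {
            largest = left;
        }
        if (right < n && arr[right] > arr[largest]) {
            largest = right;
        }
        if (largest == i) {
            return;  // node is at least as large as both children: done
        }

        // Swap the node with its larger child and continue from there.
        int tmp = arr[i];
        arr[i] = arr[largest];
        arr[largest] = tmp;
        i = largest;
    }
}
```

The loop performs the same comparisons and swaps as the recursive version but does not grow the call stack, which may matter for very large arrays or constrained environments.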
Approach 2: The heapSort Function
The heapSort function orchestrates the entire sorting process using heapify. It first builds a Max-Heap from the entire array and then repeatedly extracts the maximum element and re-heapifies the remaining elements.
- One-line summary: Builds a Max-Heap and then repeatedly extracts the root (largest element) to sort the array.
- Stepwise explanation for clarity:
- Build a Max-Heap:
  - The loop for (int i = n / 2 - 1; i >= 0; i--) iterates from the last non-leaf node up to the root (index 0).
  - For each such node i, heapify(arr, n, i) is called. This process ensures that by the time the loop finishes, the entire array is structured as a Max-Heap, with the largest element at arr[0].
- Extract Elements and Sort:
  - The second loop for (int i = n - 1; i > 0; i--) iterates from the end of the array backwards.
  - In each iteration:
    - The largest element (currently at arr[0]) is swapped with the element at arr[i]. This effectively places the largest element into its correct sorted position at the end of the unsorted part of the array.
    - heapify(arr, i, 0) is then called. The n parameter for heapify is now i, indicating that the heap size has been reduced by one (the last element is now sorted). This call re-establishes the Max-Heap property for the remaining i elements, bringing the next largest element to arr[0].
  - This process continues until all elements are in their sorted positions.
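To watch the two phases at work, the sort can be instrumented to print the array after the build phase and after every extraction. This is a sketch for experimentation only; heapSortTraced and printState are hypothetical names, and the "|" separator between the heap and the sorted suffix is an arbitrary formatting choice.

```c
#include <assert.h>
#include <stdio.h>

// Recursive sift-down, repeated here so the sketch compiles on its own.
void heapify(int arr[], int n, int i) {
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;
    if (left < n && arr[left] > arr[largest]) largest = left;
    if (right < n && arr[right] > arr[largest]) largest = right;
    if (largest != i) {
        int tmp = arr[i];
        arr[i] = arr[largest];
        arr[largest] = tmp;
        heapify(arr, n, largest);
    }
}

// Prints the heap part arr[0..heapSize-1], then "|", then the sorted suffix.
void printState(const int arr[], int n, int heapSize) {
    for (int i = 0; i < n; i++) {
        if (i == heapSize) {
            printf("| ");
        }
        printf("%d ", arr[i]);
    }
    printf("\n");
}

// Hypothetical traced variant of heapSort: same algorithm, extra output.
void heapSortTraced(int arr[], int n) {
    assert(n >= 0);
    // Phase 1: build the max-heap.
    for (int i = n / 2 - 1; i >= 0; i--) {
        heapify(arr, n, i);
    }
    printState(arr, n, n);
    // Phase 2: move the root to position i, then repair the smaller heap.
    for (int i = n - 1; i > 0; i--) {
        int tmp = arr[0];
        arr[0] = arr[i];
        arr[i] = tmp;
        heapify(arr, i, 0);
        printState(arr, n, i);
    }
}
```

Called on {12, 11, 13, 5, 6, 7}, the first line printed is the heap after the build phase, 13 11 12 5 6 7, and each later line shows the sorted suffix growing by one element.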
Conclusion
Heap Sort is an in-place sorting algorithm with a time complexity of O(n log n) in all cases (best, average, worst), making it a reliable choice for various applications. It leverages the heap data structure to efficiently manage and extract the largest elements, resulting in a robust and predictable sorting mechanism.
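One way to sanity-check the O(n log n) claim is to count element comparisons and compare the total against a simple bound. The sketch below is an assumption-laden illustration: heapSortCounted and comparisonBound are hypothetical names, and the bound 2 * n * (floor(log2 n) + 1) is a deliberately loose upper limit (at most two comparisons per level of sift-down), not a tight analysis.

```c
#include <assert.h>

long comparisons = 0;  // element comparisons made by the current sort

// Compare two elements and record that a comparison happened.
int isGreater(int a, int b) {
    comparisons++;
    return a > b;
}

// Loop-based sift-down that routes every comparison through isGreater.
void heapifyCounted(int arr[], int n, int i) {
    for (;;) {
        int largest = i;
        int left = 2 * i + 1;
        int right = 2 * i + 2;
        if (left < n && isGreater(arr[left], arr[largest])) largest = left;
        if (right < n && isGreater(arr[right], arr[largest])) largest = right;
        if (largest == i) return;
        int tmp = arr[i];
        arr[i] = arr[largest];
        arr[largest] = tmp;
        i = largest;
    }
}

// Sorts arr in ascending order and returns the number of comparisons used.
long heapSortCounted(int arr[], int n) {
    assert(n >= 0);
    comparisons = 0;
    for (int i = n / 2 - 1; i >= 0; i--) {
        heapifyCounted(arr, n, i);
    }
    for (int i = n - 1; i > 0; i--) {
        int tmp = arr[0];
        arr[0] = arr[i];
        arr[i] = tmp;
        heapifyCounted(arr, i, 0);
    }
    return comparisons;
}

// Loose upper bound: 2 * n * (floor(log2 n) + 1), using integer arithmetic.
long comparisonBound(int n) {
    assert(n >= 1);
    long levels = 0;
    for (int m = n; m > 1; m /= 2) {
        levels++;  // levels ends up as floor(log2 n)
    }
    return 2L * n * (levels + 1);
}
```

Running heapSortCounted on already sorted, reverse-sorted, and random inputs of the same size should give counts that all stay under comparisonBound(n), in contrast to Quick Sort, whose comparison count can grow quadratically on unfavorable inputs.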
Summary
- Heap Sort utilizes a binary heap to sort an array.
- It operates in two main phases: building a Max-Heap and repeatedly extracting the maximum element.
- The heapify function is central to maintaining the Max-Heap property.
- Heap Sort has a consistent O(n log n) time complexity.
- It is an in-place sorting algorithm: apart from the O(log n) call stack used by the recursive heapify, it needs only constant additional memory.
- The underlying heap is commonly used to implement priority queues, and Heap Sort itself suits scenarios requiring guaranteed performance.