C Program To Implement Counting Sort
Counting sort is an efficient, non-comparison-based sorting algorithm often used when numbers are in a specific range. In this article, you will learn how to implement Counting Sort in C, understand its mechanism, and explore its practical applications.
Problem Statement
Traditional comparison-based sorting algorithms like Merge Sort or Quick Sort have a lower bound of O(n log n) time complexity. However, for certain datasets, particularly when sorting integers within a limited range, these algorithms might not be the most optimal choice. The problem is to sort an array of non-negative integers efficiently when the range of these integers is not excessively large.
Example
Consider an unsorted array: [4, 2, 2, 8, 3, 3, 1]
After applying Counting Sort, the array will be sorted as: [1, 2, 2, 3, 3, 4, 8]
Background & Knowledge Prerequisites
To understand and implement Counting Sort, you should be familiar with:
- C programming basics: Variables, arrays, loops (
for,while). - Functions in C: Defining and calling functions.
- Dynamic memory allocation (optional but good for flexible arrays):
malloc,free. - Basic array operations: Accessing elements, iterating.
Use Cases or Case Studies
Counting Sort is particularly useful in scenarios where:
- Sorting Exam Scores: When grading a large number of exams, scores typically fall within a fixed range (e.g., 0-100). Counting Sort can quickly arrange scores.
- Digit Sort (Radix Sort Component): Counting Sort is often used as a subroutine in Radix Sort, where numbers are sorted digit by digit.
- Frequency Analysis: Counting the occurrences of items (e.g., characters in a string, ages in a demographic study) before sorting them based on these counts.
- Database Record Sorting: For records where a key field is an integer within a known, small range, Counting Sort can offer performance benefits.
- Histogram Generation: Efficiently generating a histogram of integer data, where the counts for each bin are needed.
Solution Approaches
Counting Sort Algorithm
Counting Sort operates by determining the frequency of each element in the input array. It then uses this frequency information to place elements into their correct sorted positions. This approach avoids direct comparisons between elements, making it highly efficient for specific data distributions.
// Counting Sort Implementation
#include <stdio.h>
#include <stdlib.h> // For malloc and free
// Function to find the maximum element in an array
int findMax(int arr[], int n) {
int max = arr[0];
for (int i = 1; i < n; i++) {
if (arr[i] > max) {
max = arr[i];
}
}
return max;
}
// Function to implement Counting Sort
void countingSort(int arr[], int n) {
// Step 1: Find the maximum element in the input array
int max = findMax(arr, n);
// Step 2: Create a count array to store the frequency of each element
// The size of count array will be max + 1 (for 0 to max)
int *count = (int *)calloc(max + 1, sizeof(int));
if (count == NULL) {
printf("Memory allocation failed for count array.\\n");
return;
}
// Step 3: Create an output array to store the sorted elements
int *output = (int *)malloc(n * sizeof(int));
if (output == NULL) {
printf("Memory allocation failed for output array.\\n");
free(count); // Clean up already allocated memory
return;
}
// Step 4: Store the count of each element
for (int i = 0; i < n; i++) {
count[arr[i]]++;
}
// Step 5: Modify the count array to store the actual position
// of each element in the output array.
// This accumulates the counts, so count[i] now stores the number
// of elements less than or equal to i.
for (int i = 1; i <= max; i++) {
count[i] += count[i - 1];
}
// Step 6: Build the output array
// Iterate from the end of the input array to maintain stability
// (if elements are equal, their relative order is preserved).
for (int i = n - 1; i >= 0; i--) {
output[count[arr[i]] - 1] = arr[i];
count[arr[i]]--; // Decrement the count for the current element
}
// Step 7: Copy the sorted elements from the output array back to the original array
for (int i = 0; i < n; i++) {
arr[i] = output[i];
}
// Step 8: Free the dynamically allocated memory
free(count);
free(output);
}
// Utility function to print an array
void printArray(int arr[], int n) {
for (int i = 0; i < n; i++) {
printf("%d ", arr[i]);
}
printf("\\n");
}
int main() {
// Example array to be sorted
int arr[] = {4, 2, 2, 8, 3, 3, 1, 0, 9, 5};
int n = sizeof(arr) / sizeof(arr[0]);
printf("Original array: ");
printArray(arr, n);
// Apply Counting Sort
countingSort(arr, n);
printf("Sorted array: ");
printArray(arr, n);
return 0;
}
Sample Output
Original array: 4 2 2 8 3 3 1 0 9 5
Sorted array: 0 1 2 2 3 3 4 5 8 9
Stepwise Explanation
- Find Maximum Element: The algorithm first determines the maximum value (
max) in the input array. This is crucial for defining the size of thecountarray, which needs to accommodate all values from 0 up tomax. - Initialize Count Array: A
countarray is created withmax + 1elements and initialized to all zeros. Each indexiin this array will store the frequency of the numberifrom the input array. - Initialize Output Array: An
outputarray of the same size as the input array is created. This array will temporarily hold the sorted elements. - Populate Count Array: The input array is traversed. For each element
arr[i], its count is incremented in thecountarray (e.g.,count[arr[i]]++). - Modify Count Array (Cumulative Sum): The
countarray is then modified to store the cumulative sum of counts. After this step,count[i]will hold the actual position of the elementiin the sortedoutputarray. Specifically, it will represent the number of elements less than or equal toi. - Build Output Array: The input array is traversed in reverse order. For each element
arr[i]:
- Its correct position in the
outputarray iscount[arr[i]] - 1. - The element
arr[i]is placed at this calculated position in theoutputarray. - The value
count[arr[i]]is then decremented. This ensures that duplicate elements are placed correctly and maintains stability.
- Copy to Original Array: Finally, the elements from the
outputarray are copied back into the original input array, making the input array sorted. - Free Memory: Dynamically allocated memory for
countandoutputarrays is freed to prevent memory leaks.
Conclusion
Counting Sort provides an efficient solution for sorting non-negative integers within a constrained range. Its time complexity of O(n + k), where 'n' is the number of elements and 'k' is the range of input values, can outperform comparison-based sorts when 'k' is not significantly larger than 'n'. However, it requires additional space proportional to 'k' and is primarily suitable for integer data.
Summary
- Counting Sort is a non-comparison sorting algorithm.
- It works by counting the occurrences of each distinct element.
- Time Complexity: O(n + k), where n is the number of elements and k is the range of input values (max element - min element + 1).
- Space Complexity: O(n + k) due to the count and output arrays.
- Best suited for sorting integers when the range of numbers (k) is relatively small compared to the number of elements (n).
- It is a stable sorting algorithm, meaning it preserves the relative order of equal elements.