C Program For Merge Sort Using Recursion
Merge Sort is a highly efficient, comparison-based sorting algorithm crucial for handling large datasets. In this article, you will learn how to implement Merge Sort in C using a recursive approach, understanding its core principles and practical application.
Problem Statement
Efficiently organizing data is a fundamental task in computer science. Sorting an array of elements means arranging them in a specific order (e.g., ascending or descending). While simple sorting algorithms exist, their performance degrades significantly with large inputs. The problem is to sort an array of n elements with optimal time complexity, ensuring stability and robust performance.
Example
Consider an unsorted array of integers. After applying Merge Sort, the elements will be arranged in ascending order.
Unsorted Array:
[38, 27, 43, 3, 9, 82, 10]
Sorted Array (Output):
[3, 9, 10, 27, 38, 43, 82]
Background & Knowledge Prerequisites
To understand this implementation, readers should be familiar with:
- C Programming Basics: Variables, data types, loops (for, while), conditional statements (if-else).
- Arrays: Declaring, initializing, and accessing elements.
- Functions: Defining and calling functions, passing arguments.
- Recursion: Understanding base cases, recursive calls, and how functions call themselves.
- Pointers (basic): How arrays are passed to functions in C (decaying to pointers).
Use Cases or Case Studies
Merge Sort is a versatile algorithm used in various scenarios:
- External Sorting: When data is too large to fit into memory, Merge Sort can process parts of it from disk and merge them.
- Linked Lists: It's particularly efficient for sorting linked lists because it doesn't require random access to elements, making in-place merging more straightforward.
- Parallel Sorting: Its divide-and-conquer nature makes it well-suited for parallel processing environments where different sub-arrays can be sorted concurrently.
- Stable Sorting: Merge Sort is a stable sorting algorithm, meaning that elements with equal values maintain their relative order in the sorted output. This property is critical in database systems or complex data processing pipelines.
- Inversion Count Problem: Merge Sort can be modified to count inversions in an array, which is a measure of how far an array is from being sorted.
Solution Approaches
The primary approach for Merge Sort involves a recursive divide-and-conquer strategy. It breaks down an unsorted list into n sublists, each containing one element (a list of one element is considered sorted). Then, it repeatedly merges sublists to produce new sorted sublists until there is only one sorted list remaining.
Recursive Merge Sort Implementation
This approach uses two main functions: mergeSort for the recursive division and merge for combining the sorted sub-arrays.
Summary:
The mergeSort function recursively divides the array into two halves until individual elements are reached. The merge function then combines these sorted halves into a single, larger sorted array.
Code Example:
// Merge Sort using Recursion
#include <stdio.h>
#include <stdlib.h> // For malloc and free
// Function to merge two halves of an array
void merge(int arr[], int left, int mid, int right) {
int i, j, k;
int n1 = mid - left + 1;
int n2 = right - mid;
// Create temporary arrays
int *L = (int *)malloc(n1 * sizeof(int));
int *R = (int *)malloc(n2 * sizeof(int));
// Copy data to temporary arrays L[] and R[]
for (i = 0; i < n1; i++)
L[i] = arr[left + i];
for (j = 0; j < n2; j++)
R[j] = arr[mid + 1 + j];
// Merge the temporary arrays back into arr[left..right]
i = 0; // Initial index of first sub-array
j = 0; // Initial index of second sub-array
k = left; // Initial index of merged sub-array
while (i < n1 && j < n2) {
if (L[i] <= R[j]) {
arr[k] = L[i];
i++;
} else {
arr[k] = R[j];
j++;
}
k++;
}
// Copy the remaining elements of L[], if any
while (i < n1) {
arr[k] = L[i];
i++;
k++;
}
// Copy the remaining elements of R[], if any
while (j < n2) {
arr[k] = R[j];
j++;
k++;
}
// Free the dynamically allocated memory
free(L);
free(R);
}
// Main function that implements Merge Sort
void mergeSort(int arr[], int left, int right) {
if (left < right) {
// Same as (left + right) / 2, but avoids overflow for large left and right
int mid = left + (right - left) / 2;
// Sort first and second halves
mergeSort(arr, left, mid);
mergeSort(arr, mid + 1, right);
// Merge the sorted halves
merge(arr, left, mid, right);
}
}
// Utility function to print an array
void printArray(int arr[], int size) {
int i;
for (i = 0; i < size; i++)
printf("%d ", arr[i]);
printf("\\n");
}
int main() {
// Step 1: Initialize an unsorted array
int arr[] = {38, 27, 43, 3, 9, 82, 10};
int n = sizeof(arr) / sizeof(arr[0]);
// Step 2: Print the original array
printf("Original array: ");
printArray(arr, n);
// Step 3: Call mergeSort to sort the array
mergeSort(arr, 0, n - 1);
// Step 4: Print the sorted array
printf("Sorted array: ");
printArray(arr, n);
return 0;
}
Sample Output:
Original array: 38 27 43 3 9 82 10
Sorted array: 3 9 10 27 38 43 82
Stepwise Explanation:
mergeSort(arr, left, right)Function:
- Base Case: The recursion stops when
left >= right. This means the sub-array has one or zero elements, which is inherently sorted. - Divide: It calculates the middle index
midto divide the array into two halves. - Conquer (Recursive Calls): It recursively calls
mergeSortfor the left half (arr[left...mid]) and the right half (arr[mid+1...right]). This continues until the base case is reached for all sub-arrays. - Combine: After the recursive calls return (meaning the left and right halves are sorted), it calls the
mergefunction to combine these two sorted halves into a single sorted sub-array.
merge(arr, left, mid, right)Function:
- Sub-array Sizes: It calculates
n1(size of the left sub-array) andn2(size of the right sub-array). - Temporary Arrays: It creates two temporary arrays,
LandR, to hold the elements of the left and right sub-arrays, respectively. This is crucial for the merging process without overwriting data prematurely. - Copy Data: Elements from
arr[left...mid]are copied intoL, and elements fromarr[mid+1...right]are copied intoR. - Merging:
- It uses three pointers:
iforL,jforR, andkfor the mainarr(starting atleft). - It compares elements
L[i]andR[j]. The smaller element is placed intoarr[k], and its respective pointer (iorj) andkare incremented. - This continues until one of the temporary arrays is exhausted.
- Copy Remaining Elements: Any remaining elements in
LorR(if one sub-array was longer) are copied directly intoarr. - Memory Management: It frees the dynamically allocated memory for
LandRto prevent memory leaks.
Conclusion
Merge Sort provides an efficient and stable way to sort data, making it a cornerstone algorithm in computer science. Its recursive, divide-and-conquer strategy ensures a time complexity of O(N log N) in all cases (best, average, worst), which is superior to many other comparison-based sorts for larger datasets.
Summary
- Merge Sort is a recursive, divide-and-conquer sorting algorithm.
- It works by dividing an array into halves, recursively sorting each half, and then merging the sorted halves.
- The
mergefunction is the core, combining two sorted sub-arrays into a single sorted array. - It has an optimal time complexity of O(N log N) in all cases.
- Merge Sort is a stable sorting algorithm, preserving the relative order of equal elements.
- It often requires O(N) auxiliary space for temporary arrays during the merging process.