C Program For Merge Sort Using Divide And Conquer
Merge Sort is a highly efficient, comparison-based sorting algorithm that utilizes the divide and conquer paradigm. In this article, you will learn how to implement Merge Sort in C, understanding its underlying principles and practical application.
Problem Statement
The fundamental problem is to efficiently sort a given array of elements into a specific order (ascending or descending). For large datasets, naive sorting algorithms become prohibitively slow, necessitating more advanced techniques like Merge Sort that offer better time complexity.
Example
Consider an unsorted array: [38, 27, 43, 3, 9, 82, 10]
After applying Merge Sort, the array will be sorted in ascending order: [3, 9, 10, 27, 38, 43, 82]
Background & Knowledge Prerequisites
To understand and implement Merge Sort effectively, readers should have a grasp of:
- Basic C Programming: Arrays, functions, loops, and conditional statements.
- Recursion: Merge Sort inherently relies on recursive function calls.
- Divide and Conquer Paradigm: The conceptual approach of breaking down a problem into smaller, similar subproblems, solving them, and then combining their solutions.
Use Cases or Case Studies
Merge Sort is a versatile algorithm used in various scenarios due to its stability and guaranteed O(n log n) time complexity.
- External Sorting: When data to be sorted is too large to fit into RAM, Merge Sort can process chunks of data, sort them, and then merge the sorted chunks from disk.
- Parallel Sorting: Its divide and conquer nature makes it well-suited for parallel implementations, as subarrays can be sorted independently.
- Inversion Count Problem: Used as a building block for algorithms that count inversions in an array.
- Linked List Sorting: Unlike Quick Sort, Merge Sort can sort linked lists efficiently without requiring random access to elements, making it an ideal choice.
- Stable Sorting Requirement: When the relative order of equal elements must be preserved, Merge Sort is preferred over algorithms like Quick Sort.
Solution Approaches
Merge Sort works by recursively dividing an unsorted list into n sublists, each containing one element (a list of one element is considered sorted), and then repeatedly merging sublists to produce new sorted sublists until there is only one sorted list remaining.
Approach 1: The Divide Step (mergeSort function)
The mergeSort function implements the "divide" part of the algorithm. It takes an array and divides it into two halves, then recursively calls itself on each half until individual elements are reached.
Summary: This function recursively splits the array into two halves until each subarray contains only one element.
// Merge Sort Divide Step
#include <stdio.h> // Included for main function demonstration, not strictly needed for just mergeSort prototype
// Function to recursively sort the array using mergeSort
void mergeSort(int arr[], int left, int right) {
if (left < right) {
// Find the middle point
int mid = left + (right - left) / 2;
// Sort first and second halves
mergeSort(arr, left, mid);
mergeSort(arr, mid + 1, right);
// Merge the sorted halves
// The actual merge function will be defined separately
}
}
/*
// Example of how mergeSort would be called:
int main() {
int arr[] = {38, 27, 43, 3, 9, 82, 10};
int n = sizeof(arr) / sizeof(arr[0]);
mergeSort(arr, 0, n - 1);
// Print sorted array
return 0;
}
*/
Stepwise Explanation:
- Base Case: The recursion stops when
left >= right, meaning the subarray has one or zero elements, which is inherently sorted. - Find Midpoint: Calculates the middle index
midto divide the array into two halves. Usingleft + (right - left) / 2prevents potential integer overflow compared to(left + right) / 2for very largeleftandrightvalues. - Recursive Calls:
mergeSortis called for the left half (arr, left, mid) and then for the right half (arr, mid + 1, right). This continues until single-element arrays are produced. - Merge: After the recursive calls return (meaning the subarrays
[left...mid]and[mid+1...right]are sorted), amergefunction is conceptually called to combine these two sorted halves.
Approach 2: The Conquer Step (merge function)
The merge function is the core of Merge Sort, responsible for combining two sorted subarrays into a single sorted array.
Summary: This function takes two sorted subarrays and merges them into a single sorted array.
// Merge Sort Conquer Step
#include <stdio.h>
#include <stdlib.h> // For malloc and free
// Function to merge two subarrays of arr[]
// First subarray is arr[left..mid]
// Second subarray is arr[mid+1..right]
void merge(int arr[], int left, int mid, int right) {
int i, j, k;
int n1 = mid - left + 1;
int n2 = right - mid;
// Create temporary arrays
int *L = (int*) malloc(n1 * sizeof(int));
int *R = (int*) malloc(n2 * sizeof(int));
// Copy data to temp arrays L[] and R[]
for (i = 0; i < n1; i++)
L[i] = arr[left + i];
for (j = 0; j < n2; j++)
R[j] = arr[mid + 1 + j];
// Merge the temp arrays back into arr[left..right]
i = 0; // Initial index of first subarray
j = 0; // Initial index of second subarray
k = left; // Initial index of merged subarray
while (i < n1 && j < n2) {
if (L[i] <= R[j]) {
arr[k] = L[i];
i++;
} else {
arr[k] = R[j];
j++;
}
k++;
}
// Copy the remaining elements of L[], if any
while (i < n1) {
arr[k] = L[i];
i++;
k++;
}
// Copy the remaining elements of R[], if any
while (j < n2) {
arr[k] = R[j];
j++;
k++;
}
// Free the dynamically allocated memory
free(L);
free(R);
}
/*
// Example of how merge would be called (after mergeSort's recursive calls):
int main() {
int arr[] = {2, 5, 7, 1, 3, 6}; // Assume {2,5,7} and {1,3,6} are already sorted halves
int left = 0, mid = 2, right = 5;
// merge(arr, left, mid, right); // This would sort the combined array
return 0;
}
*/
Stepwise Explanation:
- Calculate Sizes:
n1andn2store the sizes of the two subarrays. - Create Temp Arrays: Two temporary arrays,
LandR, are dynamically allocated to hold the elements of the left and right subarrays, respectively. - Copy Data: Elements from the original array
arrare copied intoLandR. - Merge Back: Using three pointers (
iforL,jforR, andkforarr), elements are compared fromLandR. The smaller element is placed intoarr[k], and its respective pointer (iorj) is incremented. Thekpointer is always incremented. - Copy Remaining Elements: After one of the temporary arrays is exhausted, any remaining elements in the other array are copied directly to
arr. - Free Memory: The dynamically allocated memory for
LandRis freed to prevent memory leaks.
Approach 3: Full Merge Sort Implementation
Combining the mergeSort and merge functions, we get the complete Merge Sort algorithm.
Summary: This is the complete C program for Merge Sort, integrating both the recursive divide and iterative conquer steps to sort an array.
// Merge Sort using Divide and Conquer
#include <stdio.h>
#include <stdlib.h> // For malloc and free
// Function to merge two subarrays of arr[]
// First subarray is arr[left..mid]
// Second subarray is arr[mid+1..right]
void merge(int arr[], int left, int mid, int right) {
int i, j, k;
int n1 = mid - left + 1;
int n2 = right - mid;
// Create temporary arrays
int *L = (int*) malloc(n1 * sizeof(int));
int *R = (int*) malloc(n2 * sizeof(int));
// Copy data to temp arrays L[] and R[]
for (i = 0; i < n1; i++)
L[i] = arr[left + i];
for (j = 0; j < n2; j++)
R[j] = arr[mid + 1 + j];
// Merge the temp arrays back into arr[left..right]
i = 0; // Initial index of first subarray
j = 0; // Initial index of second subarray
k = left; // Initial index of merged subarray
while (i < n1 && j < n2) {
if (L[i] <= R[j]) {
arr[k] = L[i];
i++;
} else {
arr[k] = R[j];
j++;
}
k++;
}
// Copy the remaining elements of L[], if any
while (i < n1) {
arr[k] = L[i];
i++;
k++;
}
// Copy the remaining elements of R[], if any
while (j < n2) {
arr[k] = R[j];
j++;
k++;
}
// Free the dynamically allocated memory
free(L);
free(R);
}
// Function to recursively sort the array using mergeSort
void mergeSort(int arr[], int left, int right) {
if (left < right) {
// Find the middle point
int mid = left + (right - left) / 2;
// Sort first and second halves
mergeSort(arr, left, mid);
mergeSort(arr, mid + 1, right);
// Merge the sorted halves
merge(arr, left, mid, right);
}
}
// Utility function to print an array
void printArray(int arr[], int size) {
int i;
for (i = 0; i < size; i++)
printf("%d ", arr[i]);
printf("\\n");
}
// Driver program to test the merge sort implementation
int main() {
int arr[] = {38, 27, 43, 3, 9, 82, 10};
int n = sizeof(arr) / sizeof(arr[0]);
printf("Given array is \\n");
printArray(arr, n);
mergeSort(arr, 0, n - 1);
printf("Sorted array is \\n");
printArray(arr, n);
return 0;
}
Sample Output:
Given array is
38 27 43 3 9 82 10
Sorted array is
3 9 10 27 38 43 82
Stepwise Explanation:
- Main Function: Initializes an unsorted integer array and determines its size.
- Print Original: Displays the unsorted array to the console.
- Call Merge Sort: Invokes
mergeSortwith the array and its initial and final indices (0andn-1). - Recursive Division:
mergeSortrecursively divides the array until single-element subarrays are formed. - Merging: As the recursive calls return, the
mergefunction is called at each step to combine the sorted subarrays, gradually building up the fully sorted array. - Print Sorted: After
mergeSortcompletes, theprintArrayfunction displays the now-sorted array.
Conclusion
Merge Sort is a robust and efficient sorting algorithm, particularly valuable for large datasets and scenarios requiring stable sorting. Its consistent O(n log n) time complexity, regardless of the input data arrangement, makes it a reliable choice for various applications. The divide and conquer strategy allows for a clear, recursive breakdown of the sorting problem, which is then solved by the careful merging of sorted subarrays.
Summary
- Merge Sort is a comparison-based sorting algorithm.
- It operates on the divide and conquer principle.
- Time Complexity: O(n log n) in all cases (best, average, worst).
- Space Complexity: O(n) due to the temporary arrays used during merging.
- It is a stable sorting algorithm, preserving the relative order of equal elements.
- Well-suited for external sorting and parallel processing.
- Involves two main steps: dividing the array recursively and merging sorted subarrays.