C++ Program For Merge Sort Algorithm In Data Structure
Sorting large datasets efficiently is a fundamental task in computer science. Algorithms like Merge Sort provide a robust and predictable way to organize data, offering significant advantages in various applications. In this article, you will learn how the Merge Sort algorithm works, its implementation in C++, and its practical applications.
Problem Statement
The challenge is to sort an unsorted list of elements (e.g., numbers, strings) into a specific order (ascending or descending) efficiently. While simple sorting algorithms exist, many become inefficient as the size of the dataset grows, leading to unacceptable processing times, especially with millions or billions of items.
Example
Consider an unsorted array of integers: [38, 27, 43, 3, 9, 82, 10].
After applying Merge Sort, the array should be sorted in ascending order: [3, 9, 10, 27, 38, 43, 82].
Background & Knowledge Prerequisites
To understand and implement Merge Sort in C++, you should have a basic understanding of:
- C++ Basics: Variables, data types, functions, arrays.
- Recursion: Understanding how functions call themselves and the concept of a base case.
- Pointers and Arrays: How arrays are passed to functions in C++.
Use Cases or Case Studies
Merge Sort is a powerful algorithm suitable for various scenarios:
- External Sorting: When data is too large to fit into memory, Merge Sort can efficiently sort chunks of data, then merge the sorted chunks.
- Inversion Count Problem: It can be modified to count inversions in an array, which is useful in data analysis.
- Implementing Custom Comparators: Its stable nature makes it ideal for sorting complex objects based on multiple criteria without disturbing the relative order of equal elements.
- Parallel Processing: The divide-and-conquer nature of Merge Sort makes it highly amenable to parallelization, where different sub-arrays can be sorted concurrently.
- Guaranteed Performance: Unlike some other sorting algorithms (like Quick Sort) that can degrade to O(N^2) in worst-case scenarios, Merge Sort consistently performs at O(N log N).
Solution Approaches
Merge Sort is a "divide and conquer" algorithm. It recursively divides the array into halves until each sub-array contains a single element (which is by definition sorted). Then, it repeatedly merges these sub-arrays to produce new sorted sub-arrays until the entire array is sorted.
Merge Sort Algorithm
This approach involves two main functions: mergeSort which handles the division, and merge which handles the combination of sorted sub-arrays.
One-line summary: Recursively divides an unsorted list into n sub-lists, each containing one element, then repeatedly merges sub-lists to produce new sorted sub-lists until there is only one sorted list.
Code Example:
// Merge Sort Algorithm
#include <iostream>
#include <vector> // Using vector for dynamic arrays and ease of use
// Function to merge two halves of an array
void merge(std::vector<int>& arr, int left, int mid, int right) {
int n1 = mid - left + 1;
int n2 = right - mid;
// Create temporary vectors
std::vector<int> L(n1);
std::vector<int> R(n2);
// Copy data to temporary vectors L[] and R[]
for (int i = 0; i < n1; i++)
L[i] = arr[left + i];
for (int j = 0; j < n2; j++)
R[j] = arr[mid + 1 + j];
// Merge the temporary vectors back into arr[left..right]
int i = 0; // Initial index of first sub-vector
int j = 0; // Initial index of second sub-vector
int k = left; // Initial index of merged array
while (i < n1 && j < n2) {
if (L[i] <= R[j]) {
arr[k] = L[i];
i++;
} else {
arr[k] = R[j];
j++;
}
k++;
}
// Copy the remaining elements of L[], if any
while (i < n1) {
arr[k] = L[i];
i++;
k++;
}
// Copy the remaining elements of R[], if any
while (j < n2) {
arr[k] = R[j];
j++;
k++;
}
}
// Function to implement Merge Sort
void mergeSort(std::vector<int>& arr, int left, int right) {
if (left >= right) { // Base case: if array has 0 or 1 element
return;
}
int mid = left + (right - left) / 2;
mergeSort(arr, left, mid); // Sort first half
mergeSort(arr, mid + 1, right); // Sort second half
merge(arr, left, mid, right); // Merge the sorted halves
}
// Helper function to print the array
void printArray(const std::vector<int>& arr) {
for (int i = 0; i < arr.size(); i++) {
std::cout << arr[i] << " ";
}
std::cout << std::endl;
}
int main() {
// Step 1: Initialize an unsorted vector
std::vector<int> arr = {38, 27, 43, 3, 9, 82, 10};
std::cout << "Original array: ";
printArray(arr);
// Step 2: Apply Merge Sort
mergeSort(arr, 0, arr.size() - 1);
// Step 3: Print the sorted array
std::cout << "Sorted array: ";
printArray(arr);
return 0;
}
Sample Output:
Original array: 38 27 43 3 9 82 10
Sorted array: 3 9 10 27 38 43 82
Stepwise Explanation for Clarity:
mergeSort(arr, left, right)Function:
- Base Case: If
leftis greater than or equal toright, it means the sub-array has zero or one element, which is already sorted. The recursion stops here. - Divide: Calculates the middle index (
mid) to split the current sub-array into two halves. - Conquer (Recursive Calls): Recursively calls
mergeSortfor the left half (arr, left, mid) and the right half (arr, mid + 1, right). This process continues until all sub-arrays are broken down to single elements. - Combine: Once the recursive calls return (meaning the sub-arrays are sorted), it calls the
mergefunction to combine the two sorted halves back into a single sorted sub-array.
merge(arr, left, mid, right)Function:
- This function takes two sorted sub-arrays
arr[left...mid]andarr[mid+1...right]and merges them into a single sorted sub-array within the originalarr. - Temporary Arrays: It creates two temporary vectors,
LandR, to hold the elements of the left and right sub-arrays, respectively. - Merging: It uses three pointers (
iforL,jforR, andkforarrstarting atleft). It compares the elements pointed to byiandj. The smaller element is placed intoarr[k], and its respective pointer (iorj) is incremented, along withk. - Remaining Elements: After one of the temporary arrays is exhausted, any remaining elements in the other temporary array are simply copied back into
arr.
Conclusion
Merge Sort is a highly efficient and stable sorting algorithm with a consistent time complexity of O(N log N) in all cases (best, average, and worst). Its divide-and-conquer approach makes it suitable for large datasets and external sorting, while its stability preserves the relative order of equal elements.
Summary
- Merge Sort is a divide and conquer sorting algorithm.
- It has a time complexity of O(N log N) in all cases (best, average, worst).
- It is a stable sorting algorithm, preserving the relative order of equal elements.
- The algorithm involves two main phases: splitting the array recursively and merging sorted sub-arrays.
- It requires O(N) auxiliary space for temporary arrays during the merging process.
- Use cases include external sorting, parallel processing, and situations requiring stable sorts.