Sorting Elements Of An Array By Frequency In C++ Program
Sorting elements by frequency is a common requirement in data analysis and algorithm design. This article will guide you through effective C++ programming techniques to sort an array where elements appear in descending order of their frequencies.
Problem Statement
Given an array of integers, the task is to sort its elements based on their frequencies. Elements that appear more often should come before elements that appear less often. If two elements have the same frequency, they should be sorted in ascending order of their values. This problem arises when analyzing data distributions, prioritizing frequently occurring items, or optimizing caches.
Example
Consider the following input array:
Input Array: [2, 5, 2, 8, 5, 6, 8, 8]
The expected output after sorting by frequency would be:
Output Array: [8, 8, 8, 2, 2, 5, 5, 6]
Here, '8' appears 3 times, '2' appears 2 times, '5' appears 2 times, and '6' appears 1 time. '8' comes first due to highest frequency. '2' and '5' have the same frequency (2), but '2' comes before '5' because 2 < 5.
Background & Knowledge Prerequisites
To effectively understand the solutions, readers should have a basic understanding of:
- C++ fundamentals: variables, loops, arrays.
- Standard Template Library (STL) containers:
std::vector,std::map,std::unordered_map. - STL algorithms:
std::sort. - Custom comparators for sorting (lambda expressions or functor classes).
std::pairfor storing key-value pairs.
Relevant includes for the solutions will typically be:
#include <iostream> // For input/output operations
#include <vector> // For dynamic arrays
#include <map> // For storing frequencies in sorted order of keys
#include <unordered_map> // For storing frequencies without key order
#include <algorithm> // For std::sort
Use Cases or Case Studies
Sorting elements by frequency is a fundamental operation with various practical applications:
- Data Analysis: Identifying the most frequent items in a dataset, like top-selling products, most common words in a text, or popular search queries.
- Compression Algorithms: In Huffman coding or other data compression techniques, frequently occurring symbols are assigned shorter codes, requiring frequency analysis.
- Cache Optimization: Evicting least frequently used (LFU) items or prioritizing frequently accessed items in a cache system to improve performance.
- Recommendation Systems: Recommending items based on their popularity or how often users interact with them.
- Network Traffic Analysis: Detecting popular destinations or services by analyzing the frequency of IP addresses or port numbers to identify potential bottlenecks or unusual patterns.
Solution Approaches
We will explore two primary approaches using C++ STL containers:
- Using
std::mapto count frequencies, then sortingstd::vectorof pairs. - Using
std::unordered_mapto count frequencies, then sortingstd::vectorof pairs.
Approach 1: Using std::map and std::vector (with Custom Comparator)
This approach uses a std::map to count the frequency of each element. std::map stores elements in sorted order of keys, but this is not directly useful for frequency-based sorting. We then transfer these element-frequency pairs into a std::vector and sort it using a custom comparator.
- One-line summary: Count frequencies with
std::map, transfer to a vector of pairs, then sort the vector using a lambda that prioritizes frequency (descending) and then element value (ascending).
// Sort Array by Frequency (using std::map)
#include <iostream>
#include <vector>
#include <map>
#include <algorithm> // For std::sort
using namespace std;
// Custom comparator function for sorting pairs
bool comparePairs(const pair<int, int>& a, const pair<int, int>& b) {
// Sort by frequency in descending order
if (a.second != b.second) {
return a.second > b.second;
}
// If frequencies are same, sort by value in ascending order
return a.first < b.first;
}
int main() {
// Step 1: Define the input array
vector<int> arr = {2, 5, 2, 8, 5, 6, 8, 8};
cout << "Original array: ";
for (int x : arr) {
cout << x << " ";
}
cout << endl;
// Step 2: Count frequencies of each element using a map
map<int, int> frequencyMap;
for (int x : arr) {
frequencyMap[x]++;
}
// Step 3: Transfer map contents to a vector of pairs
// Each pair will store {element, frequency}
vector<pair<int, int>> frequencyVector;
for (auto const& [element, freq] : frequencyMap) {
frequencyVector.push_back({element, freq});
}
// Step 4: Sort the vector of pairs using the custom comparator
sort(frequencyVector.begin(), frequencyVector.end(), comparePairs);
// Step 5: Construct the sorted result array
vector<int> sortedArr;
for (const auto& p : frequencyVector) {
// Add the element 'p.first' 'p.second' times to the result
for (int i = 0; i < p.second; ++i) {
sortedArr.push_back(p.first);
}
}
// Step 6: Print the sorted array
cout << "Sorted array by frequency: ";
for (int x : sortedArr) {
cout << x << " ";
}
cout << endl;
return 0;
}
- Sample Output:
Original array: 2 5 2 8 5 6 8 8
Sorted array by frequency: 8 8 8 2 2 5 5 6
- Stepwise Explanation:
- Initialize: An input
std::vectoris created. - Count Frequencies: A
std::mapis used to store element-frequency pairs. The map's keys are the array elements, and values are their counts. Iterating through the input array and incrementingfrequencyMap[x]efficiently populates this map. - Transfer to Vector: The contents of
frequencyMapare then transferred to astd::vector>. Eachstd::pairholds an element and its corresponding frequency. This step is necessary becausestd::mapdoes not allow custom sorting directly based on values (frequencies). - Custom Sort: The
std::vectorof pairs is sorted usingstd::sortalong with a custom comparison function (comparePairs). This function ensures that:- Pairs with higher frequencies come first (descending frequency).
- If frequencies are equal, pairs with smaller element values come first (ascending element value).
- Reconstruct Array: Finally, a new
std::vectoris constructed by iterating through the sortedfrequencyVector. For each pair{element, frequency}, theelementis addedfrequencytimes to the new vector. - Print Result: The sorted array is printed.
Approach 2: Using std::unordered_map and std::vector (with Custom Comparator)
This approach is very similar to the first, but it uses std::unordered_map instead of std::map for counting frequencies. std::unordered_map provides average O(1) time complexity for insertions and lookups, which can be faster than std::map's O(log N) for large datasets. However, elements are not stored in any particular order in std::unordered_map.
- One-line summary: Count frequencies with
std::unordered_map, transfer to a vector of pairs, then sort the vector using a lambda that prioritizes frequency (descending) and then element value (ascending).
// Sort Array by Frequency (using std::unordered_map)
#include <iostream>
#include <vector>
#include <unordered_map> // For unordered_map
#include <algorithm> // For std::sort
using namespace std;
// Custom comparator function (can also be a lambda for brevity in main)
bool comparePairsUnordered(const pair<int, int>& a, const pair<int, int>& b) {
if (a.second != b.second) {
return a.second > b.second; // Sort by frequency in descending order
}
return a.first < b.first; // If frequencies are same, sort by value in ascending order
}
int main() {
// Step 1: Define the input array
vector<int> arr = {2, 5, 2, 8, 5, 6, 8, 8};
cout << "Original array: ";
for (int x : arr) {
cout << x << " ";
}
cout << endl;
// Step 2: Count frequencies of each element using an unordered_map
unordered_map<int, int> frequencyUnorderedMap;
for (int x : arr) {
frequencyUnorderedMap[x]++;
}
// Step 3: Transfer unordered_map contents to a vector of pairs
vector<pair<int, int>> frequencyVector;
for (auto const& [element, freq] : frequencyUnorderedMap) {
frequencyVector.push_back({element, freq});
}
// Step 4: Sort the vector of pairs using the custom comparator
sort(frequencyVector.begin(), frequencyVector.end(), comparePairsUnordered);
// Step 5: Construct the sorted result array
vector<int> sortedArr;
for (const auto& p : frequencyVector) {
for (int i = 0; i < p.second; ++i) {
sortedArr.push_back(p.first);
}
}
// Step 6: Print the sorted array
cout << "Sorted array by frequency: ";
for (int x : sortedArr) {
cout << x << " ";
}
cout << endl;
return 0;
}
- Sample Output:
Original array: 2 5 2 8 5 6 8 8
Sorted array by frequency: 8 8 8 2 2 5 5 6
- Stepwise Explanation:
- Initialize: An input
std::vectoris created. - Count Frequencies: A
std::unordered_mapis used to store element-frequency pairs. This step is identical to Approach 1, exceptunordered_mapprovides average O(1) performance for insertion/access. - Transfer to Vector: The contents of
frequencyUnorderedMapare transferred to astd::vector>. - Custom Sort: The
std::vectorof pairs is sorted usingstd::sortwith a custom comparison function (comparePairsUnordered), which behaves identically to the comparator in Approach 1. - Reconstruct Array: A new
std::vectoris built by adding each elementfrequencytimes, based on the sorted pairs. - Print Result: The sorted array is printed.
Conclusion
Sorting an array by frequency is a common problem addressed effectively using C++ STL containers and algorithms. Both std::map and std::unordered_map provide efficient ways to count frequencies. The choice between them often depends on performance requirements, where std::unordered_map generally offers faster average-case performance for counting due to its hash-table implementation, while std::map provides guaranteed logarithmic time complexity and keeps keys sorted. Regardless of the counting mechanism, a std::vector of pairs combined with std::sort and a custom comparator is crucial for applying the desired sorting logic based on frequency and then value.
Summary
- Problem: Sort an array by element frequency (descending), with ties broken by element value (ascending).
- Counting Frequencies: Use
std::map(O(log N) per operation) orstd::unordered_map(average O(1) per operation) to store element-frequency pairs. - Data Transfer: Convert the map/unordered\_map into a
std::vector>for sorting. - Custom Sorting: Employ
std::sortwith a custom comparator (e.g., a lambda function or a separate function) that first compares frequencies in descending order and then element values in ascending order for tie-breaking. - Reconstruction: Iterate through the sorted
std::vectorto rebuild the final sorted array, appending each element as many times as its frequency.