Determine Array Is A Subset Of Another Array Or Not In C++ Program
Determining if one array is a subset of another is a common problem in computer science. This involves checking if all elements of a candidate array are present within a larger array.
In this article, you will learn how to efficiently determine if one array is a subset of another array using C++, exploring various algorithmic approaches and their practical implementations.
Problem Statement
The core problem is to ascertain whether arr1 is a subset of arr2. This implies that every element found in arr1 must also be present in arr2. The order of elements does not affect this determination. For simplicity, we consider arr1 a subset of arr2 if each unique value present in arr1 exists at least once in arr2. We are not concerned with the count of occurrences for duplicate elements in arr1 beyond their presence.
For example, if arr1 = {10, 5, 2} and arr2 = {1, 2, 3, 4, 5, 6, 10, 11}, then arr1 is a subset of arr2. However, if arr1 = {10, 5, 2, 7} and arr2 = {1, 2, 3, 4, 5, 6, 10, 11}, arr1 is not a subset because 7 is in arr1 but not in arr2.
Example
Consider the following two arrays: arr1 = {11, 1, 13, 21, 3, 7} arr2 = {11, 3, 7, 1, 21, 13, 6, 5}
In this case, arr1 is a subset of arr2 because all elements {11, 1, 13, 21, 3, 7} from arr1 are present in arr2. The expected output for this example would be "arr1 is a subset of arr2".
Background & Knowledge Prerequisites
To understand and implement the solutions effectively, readers should be familiar with:
- C++ Basics: Fundamental syntax, data types, variables, and control flow (loops, conditionals).
- Arrays: How to declare, initialize, and access elements of arrays.
- Standard Library Containers: Basic understanding of
std::vector,std::sort,std::unordered_set. - Time and Space Complexity: General concepts for evaluating algorithm efficiency.
Relevant C++ standard library components:
#includefor input/output.#includefor dynamic arrays.#includefor sorting.#includefor hash set operations.
Use Cases or Case Studies
This problem arises in various scenarios across different domains:
- Database Query Optimization: Checking if a set of required attributes for a query is fully contained within the available indexed columns.
- Feature Flag Management: Verifying if a user's assigned feature set (array 1) contains all features required for a specific application module (array 2).
- Permissions and Access Control: Determining if a user's granted permissions (array 1) are sufficient to cover the permissions required for a specific resource or action (array 2).
- Configuration Validation: Ensuring that a partial configuration (array 1) provided by a user is entirely consistent with a predefined master configuration (array 2).
- Set Theory Operations: As a fundamental operation in implementing more complex set operations like intersection or difference.
Solution Approaches
We will explore several methods to solve this problem, ranging from naive to optimized.
Approach 1: Naive (Brute-Force) Approach
This is the simplest approach, involving iterating through each element of the potential subset array (arr1) and, for each element, linearly searching through the main array (arr2) to find a match.
- Summary: For every element in
arr1, iterate through all elements inarr2to check for its presence.
// Naive Subset Check
#include <iostream>
#include <vector>
#include <algorithm> // For std::find
using namespace std;
// Function to check if arr1 is a subset of arr2 using a naive approach
bool isSubsetNaive(const vector<int>& arr1, const vector<int>& arr2) {
// Step 1: Iterate through each element of arr1
for (int i = 0; i < arr1.size(); ++i) {
// Step 2: Assume element is not found initially
bool found = false;
// Step 3: Search for arr1[i] in arr2
for (int j = 0; j < arr2.size(); ++j) {
if (arr1[i] == arr2[j]) {
found = true;
break; // Element found, no need to search further in arr2
}
}
// Step 4: If an element from arr1 is not found in arr2, it's not a subset
if (!found) {
return false;
}
}
// Step 5: If all elements of arr1 are found in arr2, it is a subset
return true;
}
int main() {
vector<int> arr1 = {11, 1, 13, 21, 3, 7};
vector<int> arr2 = {11, 3, 7, 1, 21, 13, 6, 5};
if (isSubsetNaive(arr1, arr2)) {
cout << "arr1 is a subset of arr2 (Naive Approach)" << endl;
} else {
cout << "arr1 is not a subset of arr2 (Naive Approach)" << endl;
}
vector<int> arr3 = {10, 20, 30};
vector<int> arr4 = {10, 15, 25, 35};
if (isSubsetNaive(arr3, arr4)) {
cout << "arr3 is a subset of arr4 (Naive Approach)" << endl;
} else {
cout << "arr3 is not a subset of arr4 (Naive Approach)" << endl;
}
return 0;
}
- Sample Output:
arr1 is a subset of arr2 (Naive Approach)
arr3 is not a subset of arr4 (Naive Approach)
- Stepwise Explanation:
- The
isSubsetNaivefunction takes two constant reference vectors,arr1andarr2. - It uses an outer loop to iterate through each element (
arr1[i]) of the first array (arr1). - For each
arr1[i], an inner loop iterates through all elements (arr2[j]) of the second array (arr2). - If
arr1[i]is found inarr2, thefoundflag is set totrue, and the inner loop breaks, moving to the next element ofarr1. - If, after searching through
arr2, anarr1[i]is not found (foundremainsfalse), the function immediately returnsfalsebecausearr1cannot be a subset. - If the outer loop completes without any element of
arr1failing to be found inarr2, it means all elements are present, and the function returnstrue. - The
mainfunction demonstrates its use with both subset and non-subset examples.
Approach 2: Sorting and Two Pointers
This approach first sorts both arrays. Once sorted, we can use a two-pointer technique to efficiently check for subset relationships.
- Summary: Sort both arrays and then use two pointers to traverse them simultaneously, checking for matching elements.
// Sorted Two-Pointers Subset Check
#include <iostream>
#include <vector>
#include <algorithm> // For std::sort
using namespace std;
// Function to check if arr1 is a subset of arr2 using sorting and two pointers
bool isSubsetSorted(vector<int>& arr1, vector<int>& arr2) {
// Step 1: Sort both arrays
sort(arr1.begin(), arr1.end());
sort(arr2.begin(), arr2.end());
// Step 2: Initialize two pointers
int i = 0; // Pointer for arr1
int j = 0; // Pointer for arr2
// Step 3: Traverse both arrays using pointers
while (i < arr1.size() && j < arr2.size()) {
// If current element in arr1 is smaller than current in arr2,
// then arr1[i] cannot be in arr2 (since both are sorted), so it's not a subset.
if (arr1[i] < arr2[j]) {
return false;
}
// If elements match, advance both pointers
else if (arr1[i] == arr2[j]) {
i++;
j++;
}
// If current element in arr1 is greater than current in arr2,
// advance arr2's pointer to find a potential match for arr1[i].
else { // arr1[i] > arr2[j]
j++;
}
}
// Step 4: If all elements of arr1 have been matched (i reaches end of arr1)
return (i == arr1.size());
}
int main() {
// Note: Vectors are passed by value to be sorted in the function
// For original vectors to remain unchanged, create copies if needed.
vector<int> arr1_a = {11, 1, 13, 21, 3, 7};
vector<int> arr2_a = {11, 3, 7, 1, 21, 13, 6, 5};
if (isSubsetSorted(arr1_a, arr2_a)) {
cout << "arr1_a is a subset of arr2_a (Sorted Approach)" << endl;
} else {
cout << "arr1_a is not a subset of arr2_a (Sorted Approach)" << endl;
}
vector<int> arr1_b = {10, 20, 30};
vector<int> arr2_b = {10, 15, 25, 35};
if (isSubsetSorted(arr1_b, arr2_b)) {
cout << "arr1_b is a subset of arr2_b (Sorted Approach)" << endl;
} else {
cout << "arr1_b is not a subset of arr2_b (Sorted Approach)" << endl;
}
return 0;
}
- Sample Output:
arr1_a is a subset of arr2_a (Sorted Approach)
arr1_b is not a subset of arr2_b (Sorted Approach)
- Stepwise Explanation:
- The
isSubsetSortedfunction receives twovectorarguments by value, allowing them to be sorted locally without affecting the original vectors inmain. - Both
arr1andarr2are sorted usingstd::sort. This takes O(N log N + M log M) time, where N and M are the sizes ofarr1andarr2. - Two pointers,
iforarr1andjforarr2, are initialized to 0. - The
whileloop continues as long as both pointers are within their respective array bounds. - Inside the loop:
- If
arr1[i]is less thanarr2[j], it meansarr1[i]cannot be found inarr2(sincearr2is sorted, any element greater thanarr2[j]would also be greater thanarr1[i]), soarr1is not a subset, and the function returnsfalse.
- If
- If
arr1[i]equalsarr2[j], a match is found. Both pointersiandjare advanced to look for the next element ofarr1. - If
arr1[i]is greater thanarr2[j], it meansarr2[j]is too small to bearr1[i], so we advance pointerjto the next element inarr2to find a potential match forarr1[i].
- After the loop finishes, if
ihas reached the end ofarr1(i == arr1.size()), it means all elements ofarr1were successfully found inarr2. The function then returnstrue. Otherwise, ifihas not reached the end ofarr1, it implies some elements ofarr1were not found, andfalseis returned.
Approach 3: Using Hash Set (Unordered Set)
This is generally the most efficient approach for average-case scenarios, especially when arrays are large and unsorted. It leverages the O(1) average time complexity for insertion and lookup operations in hash sets.
- Summary: Insert all elements of the larger array (
arr2) into anstd::unordered_set. Then, iterate througharr1and check if each element exists in the set.
// Hash Set Subset Check
#include <iostream>
#include <vector>
#include <unordered_set> // For std::unordered_set
using namespace std;
// Function to check if arr1 is a subset of arr2 using a hash set
bool isSubsetHashSet(const vector<int>& arr1, const vector<int>& arr2) {
// Step 1: Insert all elements of the larger array (arr2) into a hash set
unordered_set<int> s;
for (int x : arr2) {
s.insert(x);
}
// Step 2: Iterate through arr1 and check for presence in the hash set
for (int x : arr1) {
// Step 3: If an element from arr1 is not found in the hash set, it's not a subset
if (s.find(x) == s.end()) {
return false;
}
}
// Step 4: If all elements of arr1 are found in the hash set, it is a subset
return true;
}
int main() {
vector<int> arr1_a = {11, 1, 13, 21, 3, 7};
vector<int> arr2_a = {11, 3, 7, 1, 21, 13, 6, 5};
if (isSubsetHashSet(arr1_a, arr2_a)) {
cout << "arr1_a is a subset of arr2_a (Hash Set Approach)" << endl;
} else {
cout << "arr1_a is not a subset of arr2_a (Hash Set Approach)" << endl;
}
vector<int> arr1_b = {10, 20, 30};
vector<int> arr2_b = {10, 15, 25, 35};
if (isSubsetHashSet(arr1_b, arr2_b)) {
cout << "arr1_b is a subset of arr2_b (Hash Set Approach)" << endl;
} else {
cout << "arr1_b is not a subset of arr2_b (Hash Set Approach)" << endl;
}
vector<int> arr1_c = {1, 2, 2}; // arr1 with duplicates
vector<int> arr2_c = {1, 2, 3, 4}; // arr2 contains 1 and 2
if (isSubsetHashSet(arr1_c, arr2_c)) {
cout << "arr1_c is a subset of arr2_c (Hash Set Approach - with duplicates in arr1)" << endl;
} else {
cout << "arr1_c is not a subset of arr2_c (Hash Set Approach - with duplicates in arr1)" << endl;
}
return 0;
}
- Sample Output:
arr1_a is a subset of arr2_a (Hash Set Approach)
arr1_b is not a subset of arr2_b (Hash Set Approach)
arr1_c is a subset of arr2_c (Hash Set Approach - with duplicates in arr1)
- Stepwise Explanation:
- The
isSubsetHashSetfunction takes two constant reference vectors. - An
std::unordered_set sis created. This set will store unique elements fromarr2. - All elements from
arr2are inserted into the hash set. This operation takes O(M) time on average, where M is the size ofarr2. - The function then iterates through each element
xinarr1. - For each
x, it attempts to find it in the hash set usings.find(x). Ifs.find(x)returnss.end(), it meansxis not present inarr2, soarr1is not a subset, and the function returnsfalse. - If the loop completes, it means all elements of
arr1were found inarr2's hash set representation, and the function returnstrue. - This approach handles duplicate elements in
arr1correctly based on our problem definition: ifarr1has2, 2, andarr2has2, the first2fromarr1is found, and the second2is also found, thus fulfilling the requirement that all unique values fromarr1exist inarr2.
Conclusion
Determining if one array is a subset of another is a foundational problem with multiple solutions, each offering different trade-offs in terms of time and space complexity. The naive approach is straightforward but inefficient for larger arrays. Sorting both arrays and using a two-pointer technique significantly improves efficiency. However, for the best average-case performance, especially when dealing with large or unsorted data, the hash set approach is generally preferred due to its O(N+M) average time complexity. Choosing the right approach depends on factors like array size, frequency of operations, and whether the data is already sorted.
Summary
- Problem: Check if all elements of
arr1are present inarr2. - Naive Approach:
- Iterate
arr1, for each element searcharr2. - Time Complexity: O(N\*M).
- Space Complexity: O(1).
- Sorting + Two Pointers Approach:
- Sort both
arr1andarr2. - Use two pointers to compare elements.
- Time Complexity: O(N log N + M log M).
- Space Complexity: O(1) (if sorting in-place) or O(N+M) (if creating sorted copies).
- Hash Set (Unordered Set) Approach:
- Insert all elements of
arr2into anstd::unordered_set. - Iterate
arr1, check if each element exists in the set. - Time Complexity: O(N + M) on average.
- Space Complexity: O(M) for the hash set.
- Recommendation: For optimal average-case performance and clarity, the Hash Set Approach is generally the most efficient for checking subset relationships in unsorted arrays.