Determine Array Is A Subset Of Another Array Or Not In C
Determining if one array is a subset of another is a common problem in computer science. This involves checking whether all elements of a potential subset array are present in a larger array, considering their frequency (multiset subset).
In this article, you will learn how to check if a given array is a subset of another array in C, exploring both a brute-force approach and a more efficient method using sorting.
Problem Statement
The core problem is to ascertain if every element in a smaller array (let's call it arr2) is also present in a larger array (let's call it arr1), taking into account the multiplicity of elements. This means if an element appears k times in arr2, it must appear at least k times in arr1. The order of elements does not matter.
For instance:
-
arr2 = {1, 2}is a subset ofarr1 = {1, 2, 3, 4}. -
arr2 = {1, 2, 2}is a subset ofarr1 = {1, 2, 2, 3}. -
arr2 = {1, 2, 2}is not a subset ofarr1 = {1, 2, 3}becausearr1only has one2.
Example
Consider two arrays: arr1 = {10, 5, 2, 23, 19} and arr2 = {19, 5, 2}.
The expected output for this case would be "arr2 is a subset of arr1" because all elements {19, 5, 2} from arr2 are found in arr1.
If arr1 = {10, 5, 2, 23, 19} and arr2 = {19, 5, 2, 7}, the expected output would be "arr2 is NOT a subset of arr1" because 7 is present in arr2 but not in arr1.
Background & Knowledge Prerequisites
To understand the solutions presented, you should be familiar with:
- C Language Basics: Variables, data types, operators.
- Arrays: Declaration, initialization, and access.
- Loops:
forandwhileloops for iteration. - Functions: Defining and calling functions.
- Pointers: Basic understanding of pointers, especially for
qsort. - Standard Library Functions: Specifically
qsortfor sorting.
Use Cases
Determining array subsets has several practical applications:
- Database Query Optimization: Checking if a sub-query's required fields are available in a larger dataset.
- Inventory Management: Verifying if all components for a product are available in the current stock.
- Access Control: Ensuring a user's permissions (subset of required permissions) are sufficient for an operation.
- Configuration Validation: Confirming that a given configuration set contains all necessary parameters.
- Dependency Checking: In software projects, ensuring all required library versions are present in the environment.
Solution Approaches
We will explore two distinct approaches to solve this problem: a straightforward brute-force method and a more efficient method leveraging sorting.
Approach 1: Brute-Force (Checking for Element Presence with Marking)
This approach iterates through each element of arr2 and tries to find a matching element in arr1. To correctly handle duplicate elements (multiset subset), once an element from arr1 is used to match an element from arr2, it should not be reused for another match. This can be achieved by marking the matched element in arr1 (e.g., by changing its value to a sentinel or using a boolean flag array).
One-line summary: Iterate through each element of the potential subset array (arr2) and search for it in the larger array (arr1), marking matched elements in arr1 to prevent reuse.
Code example:
// Brute-Force Subset Check
#include <stdio.h>
#include <stdbool.h> // For boolean type
// Function to check if arr2 is a subset of arr1 using brute-force
bool isSubsetBruteForce(int arr1[], int m, int arr2[], int n) {
if (n > m) { // If arr2 is larger than arr1, it cannot be a subset
return false;
}
// Create a boolean array to keep track of matched elements in arr1
bool matched[m];
for (int i = 0; i < m; i++) {
matched[i] = false; // Initialize all elements as not matched
}
// Iterate through each element of arr2
for (int i = 0; i < n; i++) {
bool found = false;
// Search for arr2[i] in arr1
for (int j = 0; j < m; j++) {
if (!matched[j] && arr1[j] == arr2[i]) {
matched[j] = true; // Mark as matched
found = true;
break; // Move to the next element of arr2
}
}
if (!found) {
return false; // If an element of arr2 is not found in arr1, it's not a subset
}
}
return true; // All elements of arr2 were found in arr1
}
int main() {
// Step 1: Define test cases
int arr1_a[] = {10, 5, 2, 23, 19};
int m_a = sizeof(arr1_a) / sizeof(arr1_a[0]);
int arr2_a[] = {19, 5, 2};
int n_a = sizeof(arr2_a) / sizeof(arr2_a[0]);
int arr1_b[] = {1, 2, 3, 4, 5, 6};
int m_b = sizeof(arr1_b) / sizeof(arr1_b[0]);
int arr2_b[] = {1, 2, 7};
int n_b = sizeof(arr2_b) / sizeof(arr2_b[0]);
int arr1_c[] = {1, 2, 2, 3};
int m_c = sizeof(arr1_c) / sizeof(arr1_c[0]);
int arr2_c[] = {1, 2, 2};
int n_c = sizeof(arr2_c) / sizeof(arr2_c[0]);
int arr1_d[] = {1, 2, 3};
int m_d = sizeof(arr1_d) / sizeof(arr1_d[0]);
int arr2_d[] = {1, 2, 2}; // arr2 has two 2s, arr1 has only one
int n_d = sizeof(arr2_d) / sizeof(arr2_d[0]);
// Step 2: Test the brute-force function
printf("Test Case A: arr1={10,5,2,23,19}, arr2={19,5,2}\\n");
if (isSubsetBruteForce(arr1_a, m_a, arr2_a, n_a)) {
printf("arr2 is a subset of arr1\\n\\n");
} else {
printf("arr2 is NOT a subset of arr1\\n\\n");
}
printf("Test Case B: arr1={1,2,3,4,5,6}, arr2={1,2,7}\\n");
if (isSubsetBruteForce(arr1_b, m_b, arr2_b, n_b)) {
printf("arr2 is a subset of arr1\\n\\n");
} else {
printf("arr2 is NOT a subset of arr1\\n\\n");
}
printf("Test Case C (Multiset Success): arr1={1,2,2,3}, arr2={1,2,2}\\n");
if (isSubsetBruteForce(arr1_c, m_c, arr2_c, n_c)) {
printf("arr2 is a subset of arr1\\n\\n");
} else {
printf("arr2 is NOT a subset of arr1\\n\\n");
}
printf("Test Case D (Multiset Failure): arr1={1,2,3}, arr2={1,2,2}\\n");
if (isSubsetBruteForce(arr1_d, m_d, arr2_d, n_d)) {
printf("arr2 is a subset of arr1\\n\\n");
} else {
printf("arr2 is NOT a subset of arr1\\n\\n");
}
return 0;
}
Sample output:
Test Case A: arr1={10,5,2,23,19}, arr2={19,5,2}
arr2 is a subset of arr1
Test Case B: arr1={1,2,3,4,5,6}, arr2={1,2,7}
arr2 is NOT a subset of arr1
Test Case C (Multiset Success): arr1={1,2,2,3}, arr2={1,2,2}
arr2 is a subset of arr1
Test Case D (Multiset Failure): arr1={1,2,3}, arr2={1,2,2}
arr2 is NOT a subset of arr1
Stepwise explanation:
isSubsetBruteForce(int arr1[], int m, int arr2[], int n)function:- Takes two arrays (
arr1,arr2) and their sizes (m,n) as input.
- Takes two arrays (
n > m, it immediately returns false because a larger array cannot be a subset of a smaller one.matched of size m is created and initialized to false. This array keeps track of elements in arr1 that have already been used for a match.- Outer Loop (Iterating through
arr2):- It iterates from the first element of
arr2(ifrom0ton-1).
- It iterates from the first element of
arr2[i], a found flag is set to false.- Inner Loop (Searching in
arr1):- It iterates through
arr1(jfrom0tom-1).
- It iterates through
!matched[j]: Ensures that arr1[j] has not been used for a previous match.arr1[j] == arr2[i]: Checks if the current element in arr1 matches arr2[i].arr1[j] is marked as matched[j] = true, found is set to true, and the inner loop breaks (as a match for arr2[i] has been found).- Checking
foundflag:- After the inner loop finishes, if
foundis stillfalse, it meansarr2[i]was not found inarr1(or not enough distinct occurrences were available). In this case,arr2is not a subset, and the function returnsfalse.
- After the inner loop finishes, if
- Return
true: If the outer loop completes without returningfalse, it means all elements ofarr2were successfully found and matched inarr1. The function then returnstrue.
Approach 2: Using Sorting
A more efficient method involves sorting both arrays first. Once sorted, you can use a two-pointer approach to check for the subset relationship in linear time. This method inherently handles multiset subsets correctly without extra marking arrays.
One-line summary: Sort both arrays and then use two pointers to traverse them simultaneously, checking if all elements of the potential subset array (arr2) are present in the larger array (arr1) in order.
Code example:
// Sorting-Based Subset Check
#include <stdio.h>
#include <stdlib.h> // For qsort
// Comparison function for qsort
int compare(const void *a, const void *b) {
return (*(int*)a - *(int*)b);
}
// Function to check if arr2 is a subset of arr1 using sorting
bool isSubsetSorted(int arr1[], int m, int arr2[], int n) {
if (n > m) { // If arr2 is larger than arr1, it cannot be a subset
return false;
}
// Step 1: Sort both arrays
qsort(arr1, m, sizeof(int), compare);
qsort(arr2, n, sizeof(int), compare);
// Step 2: Use two pointers to compare elements
int i = 0; // Pointer for arr1
int j = 0; // Pointer for arr2
while (i < m && j < n) {
if (arr1[i] == arr2[j]) {
// Elements match, move both pointers
i++;
j++;
} else if (arr1[i] < arr2[j]) {
// arr1[i] is smaller, so it cannot match arr2[j]
// Move arr1 pointer to find a potentially matching element
i++;
} else { // arr1[i] > arr2[j]
// arr2[j] is not found in arr1 from this point onwards
// because arr1[i] is already larger, and both are sorted.
return false;
}
}
// If j reached the end, all elements of arr2 were found in arr1
return (j == n);
}
int main() {
// Step 1: Define test cases (same as brute-force for consistency)
int arr1_a[] = {10, 5, 2, 23, 19};
int m_a = sizeof(arr1_a) / sizeof(arr1_a[0]);
int arr2_a[] = {19, 5, 2};
int n_a = sizeof(arr2_a) / sizeof(arr2_a[0]);
int arr1_b[] = {1, 2, 3, 4, 5, 6};
int m_b = sizeof(arr1_b) / sizeof(arr1_b[0]);
int arr2_b[] = {1, 2, 7};
int n_b = sizeof(arr2_b) / sizeof(arr2_b[0]);
int arr1_c[] = {1, 2, 2, 3};
int m_c = sizeof(arr1_c) / sizeof(arr1_c[0]);
int arr2_c[] = {1, 2, 2};
int n_c = sizeof(arr2_c) / sizeof(arr2_c[0]);
int arr1_d[] = {1, 2, 3};
int m_d = sizeof(arr1_d) / sizeof(arr1_d[0]);
int arr2_d[] = {1, 2, 2}; // arr2 has two 2s, arr1 has only one
int n_d = sizeof(arr2_d) / sizeof(arr2_d[0]);
// Step 2: Test the sorting-based function
printf("Test Case A (Sorted): arr1={10,5,2,23,19}, arr2={19,5,2}\\n");
if (isSubsetSorted(arr1_a, m_a, arr2_a, n_a)) {
printf("arr2 is a subset of arr1\\n\\n");
} else {
printf("arr2 is NOT a subset of arr1\\n\\n");
}
printf("Test Case B (Sorted): arr1={1,2,3,4,5,6}, arr2={1,2,7}\\n");
if (isSubsetSorted(arr1_b, m_b, arr2_b, n_b)) {
printf("arr2 is a subset of arr1\\n\\n");
} else {
printf("arr2 is NOT a subset of arr1\\n\\n");
}
printf("Test Case C (Sorted Multiset Success): arr1={1,2,2,3}, arr2={1,2,2}\\n");
if (isSubsetSorted(arr1_c, m_c, arr2_c, n_c)) {
printf("arr2 is a subset of arr1\\n\\n");
} else {
printf("arr2 is NOT a subset of arr1\\n\\n");
}
printf("Test Case D (Sorted Multiset Failure): arr1={1,2,3}, arr2={1,2,2}\\n");
if (isSubsetSorted(arr1_d, m_d, arr2_d, n_d)) {
printf("arr2 is a subset of arr1\\n\\n");
} else {
printf("arr2 is NOT a subset of arr1\\n\\n");
}
return 0;
}
Sample output:
Test Case A (Sorted): arr1={10,5,2,23,19}, arr2={19,5,2}
arr2 is a subset of arr1
Test Case B (Sorted): arr1={1,2,3,4,5,6}, arr2={1,2,7}
arr2 is NOT a subset of arr1
Test Case C (Sorted Multiset Success): arr1={1,2,2,3}, arr2={1,2,2}
arr2 is a subset of arr1
Test Case D (Sorted Multiset Failure): arr1={1,2,3}, arr2={1,2,2}
arr2 is NOT a subset of arr1
Stepwise explanation:
comparefunction:- This helper function is required by
qsort. It takes twoconst voidpointers, casts them toint, dereferences them, and returns their difference. This orders integers in ascending order.
- This helper function is required by
isSubsetSorted(int arr1[], int m, int arr2[], int n)function:- Similar to the brute-force, it first checks if
n > mand returnsfalseifarr2is larger.
- Similar to the brute-force, it first checks if
qsort to sort both arr1 and arr2 in ascending order. The time complexity for sorting is O(m log m + n log n).i for arr1 and j for arr2, are initialized to 0.while loop continues as long as both pointers are within their respective array bounds (i < m && j < n).arr1[i] == arr2[j]: A match is found. Both pointers are incremented (i++, j++) to look for the next element in both arrays.arr1[i] < arr2[j]: The current element in arr1 is smaller than the current element in arr2. This means arr1[i] cannot be a match for arr2[j]. We increment i to move past arr1[i] and look for a potentially larger element in arr1.arr1[i] > arr2[j]: The current element in arr1 is greater than arr2[j]. Since both arrays are sorted, this implies that arr2[j] (and any subsequent elements equal to arr2[j]) is not present in arr1 from this point onward. Therefore, arr2 cannot be a subset, and the function returns false.while loop, if j == n, it means all elements of arr2 have been successfully matched, and arr2 is a subset of arr1. Otherwise, if j hasn't reached n, it means some elements of arr2 were not found, and the function returns false.Conclusion
We've explored two methods to determine if one array is a multiset subset of another in C. The brute-force approach, while easy to understand, involves nested loops, leading to a time complexity of O(mn) in the worst case. The sorting-based approach is generally more efficient for larger arrays, with a time complexity dominated by the sorting step, which is O(m log m + n log n), followed by a linear O(m+n) comparison. Choosing the right approach depends on the constraints of your problem, particularly the expected size of the arrays. For practical applications with potentially large datasets, the sorting approach is preferred due to its better asymptotic performance.
Summary
- A multiset subset implies every element in the smaller array must be present in the larger array with at least the same frequency.
- Brute-Force Method:
- Iterates through each element of
arr2. - Searches for a match in
arr1, marking matched elements to handle duplicates. - Time complexity:
O(mn). - Sorting-Based Method:
- Sorts both arrays using
qsort. - Uses a two-pointer approach to efficiently compare elements in sorted order.
- Time complexity:
O(m log m + n log n)due to sorting, followed byO(m+n)for comparison. - Generally more efficient for larger arrays.