C++ Program To Print Unique Rows In A Given Boolean Matrix
Working with boolean matrices often involves tasks like identifying unique patterns or configurations. When dealing with such data, a common requirement is to extract only the distinct rows, eliminating any duplicates. This can be crucial in areas like image processing, database management, or even game development for board state analysis.
In this article, you will learn how to efficiently print unique rows in a given boolean matrix using C++, exploring different algorithmic approaches.
Problem Statement
The problem requires us to process a boolean matrix (a 2D array where elements are either 0 or 1) and output each unique row exactly once. If multiple rows in the matrix are identical, only the first occurrence (or any single occurrence) should be printed. This challenge often arises when analyzing patterns, filtering redundant data, or optimizing storage for unique configurations.
Consider a boolean matrix of size R rows and C columns. We need an algorithm to identify and print each row that is not identical to any row already printed.
Example
Given the following boolean matrix:
0 1 0 0
1 0 1 1
0 1 0 0
1 1 0 0
The unique rows printed should be:
0 1 0 0
1 0 1 1
1 1 0 0
Notice that the row 0 1 0 0 appears twice in the input, but is printed only once in the output.
Background & Knowledge Prerequisites
To understand and implement the solutions discussed, readers should have a basic understanding of:
- C++ Fundamentals: Variables, data types, loops (for, while), conditional statements (if-else).
- Arrays and 2D Arrays (Matrices): How to declare, initialize, and access elements in multi-dimensional arrays.
- Standard Template Library (STL): Familiarity with containers like
std::vector(for dynamic arrays),std::set(for storing unique elements), and basic string manipulation. - Pointers and Dynamic Memory Allocation (for Trie): A basic grasp of pointers and
new/deletefor creating linked data structures.
Use Cases or Case Studies
Identifying unique rows in a boolean matrix has several practical applications:
- Image Processing: In binary image analysis, unique rows can represent distinct patterns or features in an image, helping to identify objects or textures.
- Database Systems: When storing sparse data or binary flags, duplicate rows might indicate redundant entries that can be filtered out for efficiency or data integrity.
- Game Development: Analyzing game board states in turn-based games (e.g., chess, Go) to detect unique configurations and avoid re-evaluating known states.
- Bioinformatics: Identifying unique patterns in binary sequences representing genetic markers or protein structures.
- Digital Circuit Design: In logic circuit minimization, boolean matrices can represent truth tables, and unique rows might correspond to essential prime implicants.
Solution Approaches
Here we will explore two distinct approaches to print unique rows in a boolean matrix: using a hash set with string conversion, and using a Trie data structure.
Approach 1: Using a Hash Set (std::set of strings)
This approach converts each row into a string (or a unique integer representation if column count is small) and stores these string representations in a std::set. A std::set automatically handles uniqueness; if an insertion is successful (the element was not already present), the row is unique and can be printed.
- Summary: Convert each row into a string representation and store these strings in a
std::set. If a row's string representation can be successfully inserted into the set, it's unique and is printed.
// Print Unique Rows using Set of Strings
#include <iostream>
#include <vector>
#include <string>
#include <set>
using namespace std;
// Function to print unique rows
void printUniqueRows(const vector<vector<int>>& matrix) {
int R = matrix.size();
if (R == 0) return; // Handle empty matrix
int C = matrix[0].size();
set<string> uniqueRowsSet;
for (int i = 0; i < R; ++i) {
string currentRowStr = "";
for (int j = 0; j < C; ++j) {
currentRowStr += to_string(matrix[i][j]);
}
// Attempt to insert the string representation of the current row
// If insertion is successful, it means the row is unique
if (uniqueRowsSet.find(currentRowStr) == uniqueRowsSet.end()) {
uniqueRowsSet.insert(currentRowStr);
// Print the unique row
for (int j = 0; j < C; ++j) {
cout << matrix[i][j] << " ";
}
cout << endl;
}
}
}
int main() {
// Step 1: Define the boolean matrix
vector<vector<int>> matrix = {
{0, 1, 0, 0},
{1, 0, 1, 1},
{0, 1, 0, 0},
{1, 1, 0, 0}
};
cout << "Unique rows in the matrix:" << endl;
// Step 2: Call the function to print unique rows
printUniqueRows(matrix);
// Another example matrix
cout << "\\nUnique rows for another matrix:" << endl;
vector<vector<int>> matrix2 = {
{1, 1},
{0, 1},
{1, 1},
{0, 0}
};
printUniqueRows(matrix2);
return 0;
}
- Sample Output:
Unique rows in the matrix:
0 1 0 0
1 0 1 1
1 1 0 0
Unique rows for another matrix:
1 1
0 1
0 0
- Stepwise Explanation:
- Initialize
uniqueRowsSet: Create an emptystd::setto store the string representations of unique rows encountered so far. - Iterate Through Rows: Loop through each row of the input matrix.
- Convert Row to String: For each row, concatenate its boolean (0/1) values into a single string. For example,
[0, 1, 0, 0]becomes"0100". - Check for Uniqueness: Use
uniqueRowsSet.find(currentRowStr) == uniqueRowsSet.end()to check if the current row's string representation is already present in the set. - Insert and Print: If the string is not found (meaning it's a unique row), insert it into the
uniqueRowsSetand then print the original row's elements to the console. - Repeat: Continue this process for all rows in the matrix.
Approach 2: Using a Trie (Prefix Tree)
A Trie (pronounced "try") or prefix tree is an efficient data structure for storing and retrieving strings or sequences. For boolean matrices, each path from the root to a leaf (or an "end-of-row" marker) in the Trie can represent a unique row.
- Summary: Build a Trie where each path represents a row. Each node in the Trie has two children (for 0 and 1). When inserting a row, if a new path is formed (i.e., a new node is created or an "end-of-row" marker is set for the first time), the row is unique and printed.
// Print Unique Rows using Trie
#include <iostream>
#include <vector>
#include <memory> // For std::unique_ptr
using namespace std;
// Trie Node structure
struct TrieNode {
unique_ptr<TrieNode> children[2]; // children[0] for '0', children[1] for '1'
bool isEndOfRow; // Marks if a path ending at this node represents a complete row
TrieNode() : isEndOfRow(false) {
children[0] = nullptr;
children[1] = nullptr;
}
};
// Function to insert a row into the Trie and check for uniqueness
// Returns true if the row was unique (i.e., newly inserted), false otherwise
bool insert(TrieNode* root, const vector<int>& row) {
TrieNode* curr = root;
bool isNewRow = false; // Flag to check if we created any new nodes
for (int bit : row) {
if (curr->children[bit] == nullptr) {
curr->children[bit] = make_unique<TrieNode>();
isNewRow = true; // A new path segment was created
}
curr = curr->children[bit].get();
}
// If isNewRow is true, it means we created at least one new node,
// making the entire path (row) unique until this point.
// Also, if the current node's isEndOfRow was false, but now we mark it true,
// that also signifies a unique *complete* row.
if (!curr->isEndOfRow) {
curr->isEndOfRow = true;
isNewRow = true; // This specific row is now marked as complete for the first time
}
return isNewRow;
}
// Function to print unique rows using Trie
void printUniqueRowsTrie(const vector<vector<int>>& matrix) {
int R = matrix.size();
if (R == 0) return; // Handle empty matrix
// No need for C, as vector<int> row can handle different lengths (though typical for matrices to be uniform)
// Create the root of the Trie
TrieNode* root = new TrieNode(); // Using raw pointer for root, manage memory with smart pointers for children
for (const auto& row : matrix) {
// If insert returns true, it means this row was unique
if (insert(root, row)) {
for (int bit : row) {
cout << bit << " ";
}
cout << endl;
}
}
// Clean up dynamically allocated root node
// For simplicity in this example, manual delete for root.
// In a full application, a smart pointer or a wrapper class for Trie would handle this.
delete root;
}
int main() {
// Step 1: Define the boolean matrix
vector<vector<int>> matrix = {
{0, 1, 0, 0},
{1, 0, 1, 1},
{0, 1, 0, 0},
{1, 1, 0, 0}
};
cout << "Unique rows in the matrix (using Trie):" << endl;
// Step 2: Call the function to print unique rows using Trie
printUniqueRowsTrie(matrix);
// Another example matrix
cout << "\\nUnique rows for another matrix (using Trie):" << endl;
vector<vector<int>> matrix2 = {
{1, 1},
{0, 1},
{1, 1},
{0, 0},
{1, 0}
};
printUniqueRowsTrie(matrix2);
return 0;
}
- Sample Output:
Unique rows in the matrix (using Trie):
0 1 0 0
1 0 1 1
1 1 0 0
Unique rows for another matrix (using Trie):
1 1
0 1
0 0
1 0
- Stepwise Explanation:
- Define
TrieNode: Create astruct TrieNodethat contains an array of twounique_ptr(for children representing 0 and 1) and a boolean flagisEndOfRowto mark if a complete row ends at this node.unique_ptrensures automatic memory management for children nodes. - Initialize Trie Root: Create a
TrieNodeinstance to serve as the root of the Trie. insertFunction:
- Start traversing from the
root. - For each
bitin the current row: - If the child node corresponding to
bitdoes not exist, create it (make_unique). This indicates a new path segment, so set() isNewRow = true. - Move to the child node (
curr = curr->children[bit].get()). - After traversing all bits in the row, check
curr->isEndOfRow. If it'sfalse, it means this specific row has not been inserted before. Markcurr->isEndOfRow = trueand setisNewRow = true. - Return
isNewRowto indicate if the row was truly unique.
- Iterate and Print: Loop through each
rowin the matrix. Call theinsertfunction. If it returnstrue, print the current row. - Memory Management: Ensure the root node (and potentially any manually allocated nodes) are properly deallocated after use to prevent memory leaks.
unique_ptrhandles child nodes automatically.
Conclusion
Identifying and printing unique rows in a boolean matrix is a common task with diverse applications. The choice of algorithm depends largely on the constraints and characteristics of the input matrix.
The hash set approach (using std::set) offers simplicity and is often sufficient for matrices where row length isn't excessively large, as string conversions and comparisons can add overhead. Its ease of implementation makes it a good first choice.
The Trie approach, while more complex to implement, provides a more efficient solution for larger matrices, especially those with many rows or potentially long rows, as it leverages prefix sharing and avoids repeated string conversions. It's generally preferred for performance-critical scenarios.
Summary
- Problem: Identify and print distinct rows from a boolean matrix.
- Approach 1 (Hash Set):
- Convert each row into a string (e.g., "0100").
- Store these strings in a
std::set. - If a row's string representation is newly inserted into the set, print the row.
- Pros: Easy to implement.
- Cons: String conversion and comparison overhead can be significant for very long rows.
- Approach 2 (Trie):
- Build a Trie data structure where paths represent rows.
- Each Trie node has two children (0 and 1).
- When traversing to insert a row, if a new path segment or an "end-of-row" marker is set for the first time, the row is unique.
- Pros: Efficient for dense matrices and many rows; avoids string conversion overhead.
- Cons: More complex to implement.
- Use Cases: Image processing, database filtering, game state analysis, bioinformatics.