Maximum Size Rectangle Binary Sub Matrix With All 1s
This article explores how to find the maximum size rectangle containing only 1s within a given 2D binary matrix. In this article, you will learn the problem, its applications, and an efficient solution approach using dynamic programming and a stack-based algorithm.
Problem Statement
Given a 2D binary matrix filled with 0s and 1s, the task is to find the largest rectangle containing only 1s and return its area. This problem often arises in scenarios where you need to identify contiguous regions of a specific property within a grid-like structure.
Consider a manufacturing defect detection system where a grid represents a product's surface. A '1' might indicate a perfect segment, and a '0' a defective one. Finding the largest rectangle of '1's helps identify the largest defect-free region.
Example
Consider the following binary matrix:
['1','0','1','0','0']
['1','0','1','1','1']
['1','1','1','1','1']
['1','0','0','1','0']
The maximum rectangle of all 1s has an area of 6, formed by the submatrix:
['1','1','1']
['1','1','1']
(from row 1, col 2 to row 2, col 4, 0-indexed)
Background & Knowledge Prerequisites
To effectively understand the solution, a basic understanding of the following concepts is helpful:
- 2D Arrays/Matrices: How to access and manipulate elements in a grid structure.
- Dynamic Programming: The concept of breaking down a problem into overlapping subproblems and storing their solutions.
- Stack Data Structure: LIFO (Last-In, First-Out) operations, particularly useful for efficiently tracking indices or values.
Use Cases or Case Studies
This problem has practical applications across various domains:
- Image Processing: Identifying the largest contiguous region of a specific color or intensity in a binary image.
- Game Development: Detecting the largest clear area for building structures or character movement on a grid-based game board.
- Circuit Design: Locating the largest empty space on a circuit board for component placement.
- Resource Allocation: In a grid representing resource availability, finding the largest block of available resources.
- Data Analysis: Identifying patterns or clusters of similar data points within a matrix representation.
Solution Approaches
While a brute-force approach (checking every possible subrectangle) is conceptually simple, its time complexity is prohibitively high. A more efficient strategy involves transforming the problem into a known subproblem: finding the largest rectangle in a histogram.
We will focus on the most efficient approach: Dynamic Programming combined with the Largest Rectangle in Histogram algorithm.
Approach: Dynamic Programming + Largest Rectangle in Histogram
This approach breaks down the 2D problem into a series of 1D problems. For each row, we imagine a histogram where the height of each bar is the count of consecutive '1's extending upwards from that position.
- Histogram Transformation:
- Initialize a
heightsarray (orhistogramarray) for the current row, with the same number of columns as the matrix.
- Initialize a
matrix[i][j]:matrix[i][j] is '1', increment heights[j]. This means the bar at j extends one unit higher.matrix[i][j] is '0', reset heights[j] to 0. This breaks the contiguous column of '1's upwards.heights for a row, the problem becomes finding the largest rectangle in *that specific histogram*.- Largest Rectangle in Histogram (Stack-based):
- This is a classic problem solvable efficiently using a monotonic stack.
One-line Summary: Transform the matrix into a series of histograms (one for each row) and apply the largest rectangle in histogram algorithm to each generated histogram, keeping track of the maximum area found.
Code Example:
// Maximum Size Rectangle Binary Sub Matrix with All 1s
#include <stdio.h>
#include <stdlib.h> // For malloc, free
#include <string.h> // For memset, memcpy
// Helper function to find the largest rectangle area in a histogram
// Uses a monotonic stack to efficiently calculate areas.
int largestRectangleArea(int* heights, int n) {
// Step 1: Initialize stack and maxArea
int* stack = (int*)malloc(n * sizeof(int));
int top = -1; // Stack top index
int maxArea = 0;
int i = 0;
// Step 2: Iterate through all bars including a sentinel at the end
while (i < n) {
// If current bar is taller than stack top, push its index
if (top == -1 || heights[stack[top]] <= heights[i]) {
stack[++top] = i++;
} else {
// If current bar is shorter, pop stack top and calculate area
int h = heights[stack[top--]]; // Height of popped bar
int w = (top == -1) ? i : (i - stack[top] - 1); // Width
maxArea = (maxArea > h * w) ? maxArea : (h * w); // Update max area
}
}
// Step 3: Pop remaining bars from stack and calculate their areas
while (top != -1) {
int h = heights[stack[top--]];
int w = (top == -1) ? i : (i - stack[top] - 1);
maxArea = (maxArea > h * w) ? maxArea : (h * w);
}
free(stack);
return maxArea;
}
// Main function to find the maximal rectangle in a binary matrix
int maximalRectangle(char** matrix, int matrixSize, int* matrixColSize) {
if (matrixSize == 0 || *matrixColSize == 0) {
return 0;
}
int n = *matrixColSize; // Number of columns
int maxArea = 0;
// Step 1: Initialize current heights array for histogram
int* heights = (int*)malloc(n * sizeof(int));
memset(heights, 0, n * sizeof(int)); // Initialize all heights to 0
// Step 2: Iterate through each row of the matrix
for (int i = 0; i < matrixSize; ++i) {
// Step 3: Update heights for the current row's histogram
for (int j = 0; j < n; ++j) {
if (matrix[i][j] == '1') {
heights[j]++; // Increment height if '1'
} else {
heights[j] = 0; // Reset height to 0 if '0'
}
}
// Step 4: Find largest rectangle in the current histogram
int currentMaxArea = largestRectangleArea(heights, n);
maxArea = (maxArea > currentMaxArea) ? maxArea : currentMaxArea; // Update overall max area
}
free(heights);
return maxArea;
}
int main() {
// Example matrix from the problem description
char* matrix1[] = {
"10100",
"10111",
"11111",
"10010"
};
int matrix1Rows = 4;
int matrix1Cols = 5;
int matrix1ColSize = 5; // All rows have 5 columns
printf("Matrix 1:\\n");
for (int i = 0; i < matrix1Rows; ++i) {
printf("%s\\n", matrix1[i]);
}
int result1 = maximalRectangle(matrix1, matrix1Rows, &matrix1ColSize);
printf("Maximum rectangle area for Matrix 1: %d\\n\\n", result1); // Expected: 6
// Another example matrix
char* matrix2[] = {
"0110",
"1111",
"1110"
};
int matrix2Rows = 3;
int matrix2Cols = 4;
int matrix2ColSize = 4;
printf("Matrix 2:\\n");
for (int i = 0; i < matrix2Rows; ++i) {
printf("%s\\n", matrix2[i]);
}
int result2 = maximalRectangle(matrix2, matrix2Rows, &matrix2ColSize);
printf("Maximum rectangle area for Matrix 2: %d\\n", result2); // Expected: 6
return 0;
}
Sample Output:
Matrix 1:
10100
10111
11111
10010
Maximum rectangle area for Matrix 1: 6
Matrix 2:
0110
1111
1110
Maximum rectangle area for Matrix 2: 6
Stepwise Explanation for Clarity:
maximalRectangleFunction:- Initializes
maxAreato 0 and allocates memory for aheightsarray (size of matrix columns), which will store the current histogram profile.
- Initializes
matrix.matrix[i][j] in the current row:matrix[i][j] is '1', it means the column of '1's extends upwards, so heights[j] is incremented.matrix[i][j] is '0', the chain of '1's is broken, so heights[j] is reset to 0.heights array, it calls largestRectangleArea with this new heights array.largestRectangleArea is compared with maxArea, and maxArea is updated if a larger rectangle is found.maxArea is returned after all rows have been processed.largestRectangleAreaFunction:- This function takes a
heightsarray (representing a histogram) and its sizen.
- This function takes a
heights array (including an implicit bar of height 0 at the end to ensure all stack elements are processed).heights[i] is pushed: If the stack is empty or heights[i] is greater than or equal to the height of the bar at the stack's top index, i is pushed onto the stack. This maintains the non-decreasing property.heights[i] is encountered that is shorter than heights[stack[top]]:heights[i] is no longer shorter than heights[stack[top]]).h at index idx):h as the height is calculated. If the stack is empty after popping, the width is i (current index). Otherwise, the width is i - stack[top] - 1 (current index minus the index of the new stack top minus one).h * w is calculated and compared with maxArea, updating maxArea if it's larger.maxArea found in this histogram is returned.Conclusion
Solving the maximal rectangle problem efficiently involves a clever reduction to the "largest rectangle in a histogram" problem. By iterating through each row and dynamically building a histogram of heights, then applying a stack-based algorithm to each histogram, we can find the largest rectangle of '1's in O(rows * columns) time complexity. This approach transforms a complex 2D geometric problem into a series of manageable 1D problems, showcasing the power of dynamic programming and appropriate data structures.
Summary
- The problem seeks the largest area submatrix of all '1's in a binary matrix.
- It's a common problem in grid-based data analysis and image processing.
- The optimal solution involves converting each row into a histogram, where bar heights represent consecutive '1's extending upwards.
- A stack-based algorithm efficiently finds the largest rectangle in each generated histogram.
- The overall time complexity is O(rows * columns) due to iterating through each cell once to build histograms and then processing each histogram using the linear-time stack algorithm.