Pigeonhole Algorithm In C++ Program

Last Updated :13 Jan, 2026

Problem Statement

Consider a scenario where you have a collection of items, and you need to determine if any item is repeated, or if a certain condition must inevitably arise due to the number of items versus the available categories. A common problem is finding a duplicate number in an array of n integers where all integers are within the range [1, n-1]. Since there are n numbers but only n-1 possible unique values, the pigeonhole principle guarantees that at least one number must be duplicated.

Example

Imagine you have an array [1, 3, 4, 2, 2]. This array has 5 elements, and all elements are in the range [1, 4]. According to the pigeonhole principle (with 5 pigeons and 4 pigeonholes), there must be at least one duplicate. In this example, the number 2 appears twice. The output of an algorithm applying this principle would confirm the presence of a duplicate.

Background & Knowledge Prerequisites

To understand and implement algorithms based on the pigeonhole principle in C++, a solid grasp of these concepts is essential:

C++ Basics: Variables, data types, conditional statements (if-else), loops (for, while).
Arrays: Declaring, initializing, and iterating through arrays.
Functions: Defining and calling functions.
Basic Data Structures: Understanding of how arrays can be used as frequency maps or presence indicators.
The Pigeonhole Principle: The core idea that if you have more items (pigeons) than categories (pigeonholes) to put them into, at least one category must contain more than one item.

Use Cases or Case Studies

The pigeonhole principle, directly or indirectly, underpins solutions in several computational scenarios:

Finding Duplicates in a Bounded Range: This is the most direct application. If you have n numbers in the range [1, n-1], a duplicate must exist.
Detecting Collisions in Hash Tables: While not a direct implementation, the principle explains why collisions are inevitable when the number of items to hash exceeds the number of available hash buckets.
Memory Optimization: Using an array as a boolean or frequency map (where array indices are pigeonholes) can be more memory-efficient than a hash map for small, contiguous ranges.
Proving Existence: It can be used in theoretical computer science to prove the existence of certain properties within a data set without explicitly finding them.
Bit Manipulation Problems: Sometimes, constraints on bit patterns can imply duplicates or specific conditions based on the principle.

Solution Approaches

Let's explore several approaches to solve the problem of finding a duplicate in an array of n integers, where all integers are in the range [1, n-1].

1. Brute-Force Search

This approach uses nested loops to compare every element with every other element. While simple, its efficiency is low.

Summary: Iterates through all possible pairs of elements to check for equality.

Code Example:

// Brute-Force Duplicate Finder
    #include <iostream>
    #include <vector>
    #include <numeric> // For std::iota (not used in this specific example but good to include)

    int findDuplicateBruteForce(const std::vector<int>& nums) {
        // Step 1: Iterate through each element
        for (size_t i = 0; i < nums.size(); ++i) {
            // Step 2: Compare with all subsequent elements
            for (size_t j = i + 1; j < nums.size(); ++j) {
                // Step 3: If a duplicate is found, return it
                if (nums[i] == nums[j]) {
                    return nums[i];
                }
            }
        }
        // Step 4: If no duplicate found (shouldn't happen with problem constraints)
        return -1; 
    }

    int main() {
        std::vector<int> nums = {1, 3, 4, 2, 2}; // n = 5, range [1, 4]
        int duplicate = findDuplicateBruteForce(nums);

        if (duplicate != -1) {
            std::cout << "The duplicate number is: " << duplicate << std::endl;
        } else {
            std::cout << "No duplicate found." << std::endl;
        }

        std::vector<int> nums2 = {3, 1, 3, 4, 2}; // Another example
        duplicate = findDuplicateBruteForce(nums2);
        if (duplicate != -1) {
            std::cout << "The duplicate number is: " << duplicate << std::endl;
        } else {
            std::cout << "No duplicate found." << std::endl;
        }

        return 0;
    }

Run

Sample Output:

The duplicate number is: 2
The duplicate number is: 3

Stepwise Explanation:

The findDuplicateBruteForce function takes a vector of integers.
It uses an outer loop to pick an element nums[i].
An inner loop then compares nums[i] with all elements nums[j] that appear after nums[i] in the array.
If nums[i] is equal to nums[j], a duplicate is found, and its value is returned.
If the loops complete without finding a duplicate (which, given the problem constraints, shouldn't happen), it returns -1.

2. Sorting and Linear Scan

Sorting the array brings identical elements next to each other, making duplicate detection a simple linear scan.

Summary: Sorts the array, then iterates through the sorted array to find adjacent identical elements.

Code Example:

// Sorting-Based Duplicate Finder
    #include <iostream>
    #include <vector>
    #include <algorithm> // For std::sort

    int findDuplicateSorting(std::vector<int>& nums) {
        // Step 1: Sort the array
        std::sort(nums.begin(), nums.end());

        // Step 2: Iterate through the sorted array and check adjacent elements
        for (size_t i = 0; i < nums.size() - 1; ++i) {
            // Step 3: If an element is equal to its successor, it's a duplicate
            if (nums[i] == nums[i+1]) {
                return nums[i];
            }
        }
        // Step 4: If no duplicate found
        return -1;
    }

    int main() {
        std::vector<int> nums = {1, 3, 4, 2, 2};
        int duplicate = findDuplicateSorting(nums);

        if (duplicate != -1) {
            std::cout << "The duplicate number is: " << duplicate << std::endl;
        } else {
            std::cout << "No duplicate found." << std::endl;
        }

        std::vector<int> nums2 = {3, 1, 3, 4, 2};
        duplicate = findDuplicateSorting(nums2);
        if (duplicate != -1) {
            std::cout << "The duplicate number is: " << duplicate << std::endl;
        } else {
            std::cout << "No duplicate found." << std.endl;
        }

        return 0;
    }

Run

Sample Output:

The duplicate number is: 2
The duplicate number is: 3

Stepwise Explanation:

The findDuplicateSorting function first sorts the input nums vector using std::sort.
After sorting, it iterates through the array from the first element up to the second-to-last element.
In each iteration, it compares the current element nums[i] with its immediate successor nums[i+1].
If they are equal, a duplicate is found and returned.
If the loop finishes without finding any identical adjacent elements, it returns -1.

3. Using an Auxiliary Array (Pigeonholes)

This approach directly applies the pigeonhole principle by using an auxiliary array (acting as pigeonholes) to mark the presence of each number.

Summary: Creates a boolean or frequency array where indices correspond to numbers. When a number is encountered, its corresponding index in the auxiliary array is marked. If an index is already marked, a duplicate is found.

Code Example:

// Pigeonhole Principle Duplicate Finder
    #include <iostream>
    #include <vector>
    #include <numeric>

    int findDuplicatePigeonhole(const std::vector<int>& nums) {
        // Step 1: Determine the range of numbers. Given numbers are [1, n-1].
        // So, we need 'n' pigeonholes (from index 0 to n-1 or 1 to n-1, adjusted).
        // If numbers are 1 to max_val, we need a boolean array of size max_val + 1.
        // For problem: n numbers in range [1, n-1]. The size of array is nums.size().
        // Max value is nums.size() - 1. So, array of size nums.size().
        std::vector<bool> seen(nums.size(), false); // Pigeonholes initialized to empty

        // Step 2: Iterate through the input numbers (pigeons)
        for (int num : nums) {
            // Step 3: Check if this pigeonhole (index 'num') is already occupied
            if (seen[num]) {
                // Step 4: If it's occupied, we found a duplicate
                return num;
            }
            // Step 5: Mark this pigeonhole as occupied
            seen[num] = true;
        }
        // Step 6: This line should ideally not be reached given problem constraints
        return -1;
    }

    int main() {
        std::vector<int> nums = {1, 3, 4, 2, 2}; // n=5, range [1,4]. Max val 4. Seen size 5.
        int duplicate = findDuplicatePigeonhole(nums);

        if (duplicate != -1) {
            std::cout << "The duplicate number is: " << duplicate << std::endl;
        } else {
            std::cout << "No duplicate found." << std::endl;
        }

        std::vector<int> nums2 = {3, 1, 3, 4, 2}; // n=5, range [1,4]
        duplicate = findDuplicatePigeonhole(nums2);
        if (duplicate != -1) {
            std::cout << "The duplicate number is: " << duplicate << std::endl;
        } else {
            std::cout << "No duplicate found." << std::endl;
        }

        return 0;
    }

Run

Sample Output:

The duplicate number is: 2
The duplicate number is: 3

Stepwise Explanation:

A boolean vector seen is created, with its size equal to nums.size(). This vector acts as our "pigeonholes," where each index i represents the number i. All pigeonholes are initially marked as empty (false).
The algorithm then iterates through each number (num) in the input nums vector. These are our "pigeons."
For each num, it checks if the corresponding pigeonhole (seen[num]) is already marked true.
If seen[num] is true, it means we have encountered this number before, so num is a duplicate, and it is returned.
If seen[num] is false, the pigeonhole is marked as occupied (seen[num] = true) to indicate that we have seen this number.
Given the problem constraints (N numbers in range [1, N-1]), a duplicate is guaranteed, so the loop will always find and return one.

4. Using a Hash Map

A hash map (like std::unordered_set or std::unordered_map) provides a similar "pigeonhole" mechanism by quickly storing and checking for the presence of elements.

Summary: Uses a hash set to store encountered numbers. If a number is already in the set, it's a duplicate.

Code Example:

// Hash Map Duplicate Finder
    #include <iostream>
    #include <vector>
    #include <unordered_set> // For std::unordered_set

    int findDuplicateHashMap(const std::vector<int>& nums) {
        // Step 1: Create an unordered set to store unique numbers encountered
        std::unordered_set<int> seen_numbers;

        // Step 2: Iterate through the input numbers
        for (int num : nums) {
            // Step 3: Check if the number is already in the set
            if (seen_numbers.count(num)) {
                // Step 4: If yes, it's a duplicate
                return num;
            }
            // Step 5: If not, add it to the set
            seen_numbers.insert(num);
        }
        // Step 6: Should not be reached under problem constraints
        return -1;
    }

    int main() {
        std::vector<int> nums = {1, 3, 4, 2, 2};
        int duplicate = findDuplicateHashMap(nums);

        if (duplicate != -1) {
            std::cout << "The duplicate number is: " << duplicate << std::endl;
        } else {
            std::cout << "No duplicate found." << std::endl;
        }

        std::vector<int> nums2 = {3, 1, 3, 4, 2};
        duplicate = findDuplicateHashMap(nums2);
        if (duplicate != -1) {
            std::cout << "The duplicate number is: " << duplicate << std::endl;
        } else {
            std::cout << "No duplicate found." << std::endl;
        }

        return 0;
    }

Run

Sample Output:

The duplicate number is: 2
The duplicate number is: 3

Stepwise Explanation:

An std::unordered_set named seen_numbers is initialized. This set will store all unique numbers encountered so far.
The function iterates through each number num in the input nums vector.
For each num, it checks if num is already present in seen_numbers using seen_numbers.count(num).
If count(num) returns 1 (meaning it's present), then num is a duplicate, and it is returned.
If num is not in the set, it's a new unique number, so it's added to seen_numbers using seen_numbers.insert(num).
Similar to the auxiliary array approach, a duplicate is guaranteed, so a number will always be returned.

Conclusion

The pigeonhole principle provides a powerful way to reason about the existence of specific conditions within a data set, especially when dealing with bounded ranges and counts. While brute-force and sorting methods can find duplicates, the direct application of the pigeonhole principle using an auxiliary array offers an efficient solution in terms of time complexity (linear time) and, for certain problem constraints, space complexity. Hash maps provide a more generalized and flexible approach, also offering average linear time performance, making them suitable for broader applications where data ranges might not be contiguous or small. Understanding these varied approaches allows for selecting the most appropriate solution based on specific problem constraints and performance requirements.

Summary

The Pigeonhole Principle states that if you have more items than categories, at least one category must contain multiple items.
It's particularly useful for problems like finding duplicates in a collection of numbers within a limited range.
Brute-force comparison is simple but inefficient for large datasets.
Sorting the data makes duplicate detection easier by grouping identical elements.
An auxiliary array (boolean/frequency map) directly implements the principle by using array indices as "pigeonholes" to mark presence, offering an optimal time complexity (O(N)) for suitable ranges.
Hash maps (e.g., std::unordered_set) offer a flexible and generally efficient (average O(N) time) way to manage "pigeonholes" for various data types and ranges.

Hope, This article was helpful?

Sorry about that

How can we improve it?

Thank you for helping us improve this article.

Pigeonhole Algorithm In C++ Program

Problem Statement

Example

Background & Knowledge Prerequisites

Use Cases or Case Studies

Solution Approaches

1. Brute-Force Search

2. Sorting and Linear Scan

3. Using an Auxiliary Array (Pigeonholes)

4. Using a Hash Map

Conclusion

Summary

Sorry about that

Recent Articles

Related Articles

C++ Program For Quick Sort In Ascending Order...

C++ Program To Implement Topological Sort Algorithm...

Topological Sort In Data Structure And Algorithm In C++...

C++ Program To Display Upper Triangular Matrix...

C++ Program For Heap Sort Using Recursion...

Find Row With Maximum Number Of 1s In Sorted Matrix In C++...

Related Articles

C++ Program For Quick Sort In Ascending Order...

C++ Program To Implement Topological Sort Algorithm...

Topological Sort In Data Structure And Algorithm In C++...

C++ Program To Display Upper Triangular Matrix...

C++ Program For Heap Sort Using Recursion...

Find Row With Maximum Number Of 1s In Sorted Matrix In C++...