Write Code To Check If Two Strings Are Anagram Or Not In C Program
Anagrams are words or phrases formed by rearranging the letters of another, using all the original letters exactly once. For example, "listen" and "silent" are anagrams. Detecting if two strings are anagrams is a common programming challenge that involves character manipulation and comparison. In this article, you will learn how to implement two distinct approaches in C to determine if two given strings are anagrams.
Problem Statement
The core problem is to write a C program that can reliably determine whether two input strings are anagrams of each other. This means checking if they contain the same characters with the same frequencies, irrespective of their order. This is a fundamental string manipulation task often encountered in data processing, puzzle-solving, and text analysis.
Example
Consider the following two strings:
- String 1: "heart"
- String 2: "earth"
These are anagrams because "heart" can be rearranged to form "earth," using each letter exactly once.
Background & Knowledge Prerequisites
To understand the solutions presented, readers should have a basic understanding of:
- C Language Fundamentals: Variables, data types, loops (for, while), conditional statements (if-else).
- Strings in C: How strings are represented as character arrays, null termination (
\0), and common string functions fromstring.hlikestrlenandstrcmp. - Arrays: Basic array manipulation and indexing.
Essential header files for the solutions include for input/output, for string operations, and for functions like qsort (for the sorting approach).
Use Cases or Case Studies
Anagram detection has several practical applications:
- Word Games and Puzzles: Validating words in games like Scrabble or creating word puzzles.
- Text Analysis: Identifying textual similarities or detecting plagiarism at a basic character level.
- Data Validation: Checking for data entry errors where character order might change but content should remain the same (e.g., specific codes).
- Cryptography: Simple forms of character-based code generation or breaking, where rearrangements are key.
- Educational Tools: Helping students learn about word structures and letter combinations.
Solution Approaches
We will explore two effective methods for checking if two strings are anagrams: sorting and character frequency counting.
Approach 1: Sorting Strings
This method involves sorting both strings alphabetically and then comparing the sorted versions. If the sorted strings are identical, the original strings are anagrams.
- Summary: Sort both input strings and then compare the sorted strings. If they are equal, the original strings are anagrams.
// Anagram Check (Sorting Method)
#include <stdio.h>
#include <string.h>
#include <stdlib.h> // For qsort
// Comparison function for qsort
int compareChars(const void *a, const void *b) {
return (*(char*)a - *(char*)b);
}
// Function to check if two strings are anagrams using sorting
int areAnagramsSorting(char *str1, char *str2) {
// Step 1: Get lengths of both strings
int len1 = strlen(str1);
int len2 = strlen(str2);
// Step 2: If lengths are different, they cannot be anagrams
if (len1 != len2) {
return 0; // Not anagrams
}
// Step 3: Create copies of the strings to avoid modifying originals
char *s1_copy = (char *)malloc(sizeof(char) * (len1 + 1));
char *s2_copy = (char *)malloc(sizeof(char) * (len2 + 1));
if (s1_copy == NULL || s2_copy == NULL) {
perror("Memory allocation failed");
exit(EXIT_FAILURE);
}
strcpy(s1_copy, str1);
strcpy(s2_copy, str2);
// Step 4: Sort both copied strings
qsort(s1_copy, len1, sizeof(char), compareChars);
qsort(s2_copy, len2, sizeof(char), compareChars);
// Step 5: Compare the sorted strings
int result = strcmp(s1_copy, s2_copy) == 0;
// Step 6: Free dynamically allocated memory
free(s1_copy);
free(s2_copy);
return result;
}
int main() {
char str1[] = "listen";
char str2[] = "silent";
char str3[] = "hello";
char str4[] = "world";
printf("Checking \\"%s\\" and \\"%s\\":\\n", str1, str2);
if (areAnagramsSorting(str1, str2)) {
printf("They are anagrams.\\n");
} else {
printf("They are not anagrams.\\n");
}
printf("\\nChecking \\"%s\\" and \\"%s\\":\\n", str3, str4);
if (areAnagramsSorting(str3, str4)) {
printf("They are anagrams.\\n");
} else {
printf("They are not anagrams.\\n");
}
return 0;
}
- Sample Output:
Checking "listen" and "silent":
They are anagrams.
Checking "hello" and "world":
They are not anagrams.
- Stepwise Explanation:
- Length Check: The function first checks if the lengths of the two strings are equal. If they are not, they cannot be anagrams, and the function returns
0. - String Copies: Dynamic memory is allocated to create copies of the input strings. This is crucial because
qsortmodifies the string in place, and we want to preserve the original strings. - Sorting: The
qsortfunction fromstdlib.his used to sort the characters in both copied strings alphabetically. A custom comparison functioncompareCharsis provided forqsortto compare individual characters. - Comparison: After sorting,
strcmpis used to compare the two sorted strings. Ifstrcmpreturns0, it means the strings are identical, and thus the original strings were anagrams. - Memory Management: The dynamically allocated memory for the string copies is freed to prevent memory leaks.
Approach 2: Character Counting (Frequency Array)
This method involves counting the frequency of each character in both strings. If the character counts match for all characters, the strings are anagrams. This approach is generally more efficient for longer strings.
- Summary: Use an array (or hash map concept) to store character frequencies for the first string, then decrement counts using the second string. If all counts are zero at the end, they are anagrams.
// Anagram Check (Character Counting Method)
#include <stdio.h>
#include <string.h>
#include <stdlib.h> // For EXIT_FAILURE and dynamic allocation if needed, not strictly for this version
// Assuming ASCII character set for simplicity (0-255)
#define MAX_CHARS 256
// Function to check if two strings are anagrams using character counting
int areAnagramsCounting(char *str1, char *str2) {
// Step 1: Declare a frequency array and initialize it to zeros
int count[MAX_CHARS] = {0};
// Step 2: Get lengths of both strings
int len1 = strlen(str1);
int len2 = strlen(str2);
// Step 3: If lengths are different, they cannot be anagrams
if (len1 != len2) {
return 0; // Not anagrams
}
// Step 4: For the first string, increment count for each character
for (int i = 0; i < len1; i++) {
count[(unsigned char)str1[i]]++;
}
// Step 5: For the second string, decrement count for each character
for (int i = 0; i < len2; i++) {
count[(unsigned char)str2[i]]--;
}
// Step 6: Check if all counts in the array are zero
// If any count is non-zero, it means character frequencies don't match
for (int i = 0; i < MAX_CHARS; i++) {
if (count[i] != 0) {
return 0; // Not anagrams
}
}
// Step 7: If all counts are zero, they are anagrams
return 1;
}
int main() {
char str1[] = "anagram";
char str2[] = "nagaram";
char str3[] = "rat";
char str4[] = "car";
printf("Checking \\"%s\\" and \\"%s\\":\\n", str1, str2);
if (areAnagramsCounting(str1, str2)) {
printf("They are anagrams.\\n");
} else {
printf("They are not anagrams.\\n");
}
printf("\\nChecking \\"%s\\" and \\"%s\\":\\n", str3, str4);
if (areAnagramsCounting(str3, str4)) {
printf("They are anagrams.\\n");
} else {
printf("They are not anagrams.\\n");
}
return 0;
}
- Sample Output:
Checking "anagram" and "nagaram":
They are anagrams.
Checking "rat" and "car":
They are not anagrams.
- Stepwise Explanation:
- Frequency Array: An integer array
countof sizeMAX_CHARS(256 for ASCII) is declared and initialized to all zeros. This array will store the frequency of each character. - Length Check: Similar to the sorting method, an initial length check quickly filters out non-anagrams.
- Increment Counts: The first string is traversed. For each character encountered, its corresponding count in the
countarray is incremented. The(unsigned char)cast handles potential negative values for signed chars, ensuring correct array indexing. - Decrement Counts: The second string is then traversed. For each character in the second string, its corresponding count in the
countarray is decremented. - Final Check: Finally, the
countarray is iterated through. If all elements in the array are0, it means every character that was incremented by the first string was decremented by the second string exactly once, indicating they have the same character frequencies. If any element is non-zero, the strings are not anagrams.
Conclusion
Both sorting and character counting methods effectively determine if two strings are anagrams. The sorting method is intuitive and easier to grasp, but its time complexity is typically O(N log N) due to the sorting step. The character counting method is often more efficient, offering a time complexity of O(N) (where N is the length of the string, plus the constant time for iterating through the fixed-size frequency array), making it a preferred choice for larger inputs. The choice between methods often depends on the specific constraints of the problem and the desired balance between simplicity and performance.
Summary
- Anagrams are strings with the same characters but in a different order.
- The Sorting Method involves sorting both strings and comparing the results.
- Pros: Straightforward logic, leverages standard library sort functions.
- Cons: Can be less efficient for very long strings (O(N log N)).
- The Character Counting Method uses a frequency array to track character occurrences.
- Pros: Generally more efficient (O(N) time complexity).
- Cons: Requires additional space for the frequency array, might need adjustments for non-ASCII character sets.
- Always perform an initial length check; strings of different lengths cannot be anagrams.