C Program To Implement Topological Sort
In this article, you will learn how to implement topological sort in C, exploring two common algorithms: Kahn's algorithm (based on BFS) and a DFS-based approach. You will understand the underlying principles and see practical code examples.
Problem Statement
Topological sort is an algorithm that orders the vertices of a directed acyclic graph (DAG) in such a way that for every directed edge from vertex U to vertex V, U comes before V in the ordering. This ordering is not unique for all DAGs. It is crucial for problems where tasks have dependencies, such as project scheduling, dependency resolution in build systems, or course prerequisite management. Failure to find a topological sort indicates the presence of a cycle in the graph.
Example
Consider a simple directed acyclic graph with dependencies. Tasks: A, B, C, D, E, F Dependencies:
- A -> B
- A -> C
- B -> D
- C -> D
- E -> F
A possible topological sort for this graph would be: E, A, B, C, D, F.
Background & Knowledge Prerequisites
To understand topological sort, it's beneficial to have a grasp of the following concepts:
- Directed Acyclic Graphs (DAGs): Graphs where edges have a direction (U -> V) and contain no cycles.
- Graph Representation: Familiarity with adjacency list representation (an array of linked lists where each index
istores a list of vertices adjacent toi). - Breadth-First Search (BFS): An algorithm for traversing or searching tree or graph data structures. It starts at the tree root (or some arbitrary node of a graph) and explores all of the neighbor nodes at the present depth prior to moving on to the nodes at the next depth level.
- Depth-First Search (DFS): An algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root (or selects some arbitrary node as the root) and explores as far as possible along each branch before backtracking.
- Queues: A linear data structure that follows the First-In, First-Out (FIFO) principle.
- Stacks: A linear data structure that follows the Last-In, First-Out (LIFO) principle.
Use Cases or Case Studies
Topological sort finds applications in various domains where dependencies exist:
- Course Scheduling: Determining the order in which courses must be taken given their prerequisites.
- Project Management: Ordering tasks in a project plan, where some tasks must be completed before others can start.
- Build Systems: Compiling source code files in the correct order, resolving dependencies between modules.
- Data Serialization: Ordering objects to be serialized or deserialized based on their interdependencies.
- Operating System Deadlock Detection: Analyzing resource allocation graphs to detect potential deadlocks.
Solution Approaches
We will explore two primary algorithms for topological sort: Kahn's Algorithm (BFS-based) and a DFS-based approach.
1. Kahn's Algorithm (BFS-based)
Kahn's algorithm iteratively removes nodes that have no incoming edges from the graph and adds them to the topological sort.
- Summary: Uses in-degrees of vertices and a queue to find nodes with no incoming edges, processing them level by level.
Detailed Explanation
- Calculate In-degrees: For every vertex, count the number of incoming edges. Store this in an array, say
inDegree. - Initialize Queue: Add all vertices with an in-degree of 0 to a queue. These are the starting points as they have no prerequisites.
- Process Queue:
- While the queue is not empty:
- Dequeue a vertex
u. - Add
uto the topological order list. - For each neighbor
vofu: - Decrement the in-degree of
v. - If
v's in-degree becomes 0, enqueuev.
- Cycle Detection: If the total number of vertices added to the topological order list is less than the total number of vertices in the graph, it implies there's a cycle, and a topological sort is not possible.
C Code Example (Kahn's Algorithm)
// Topological Sort using Kahn's Algorithm (BFS-based)
#include <stdio.h>
#include <stdlib.h> // For malloc, free
#include <stdbool.h> // For bool
#define MAX_NODES 100
// Structure for an adjacency list node
struct AdjNode {
int dest;
struct AdjNode* next;
};
// Structure for the graph
struct Graph {
int numVertices;
struct AdjNode** adjLists; // Array of linked lists
int* inDegree; // Array to store in-degrees
};
// Function to create an adjacency list node
struct AdjNode* createAdjNode(int dest) {
struct AdjNode* newNode = (struct AdjNode*)malloc(sizeof(struct AdjNode));
newNode->dest = dest;
newNode->next = NULL;
return newNode;
}
// Function to create a graph of 'vertices' number of nodes
struct Graph* createGraph(int vertices) {
struct Graph* graph = (struct Graph*)malloc(sizeof(struct Graph));
graph->numVertices = vertices;
graph->adjLists = (struct AdjNode**)malloc(vertices * sizeof(struct AdjNode*));
graph->inDegree = (int*)calloc(vertices, sizeof(int)); // Initialize in-degrees to 0
for (int i = 0; i < vertices; i++) {
graph->adjLists[i] = NULL;
}
return graph;
}
// Function to add an edge from src to dest
void addEdge(struct Graph* graph, int src, int dest) {
// Add edge from src to dest
struct AdjNode* newNode = createAdjNode(dest);
newNode->next = graph->adjLists[src];
graph->adjLists[src] = newNode;
// Increment in-degree of destination vertex
graph->inDegree[dest]++;
}
// Function to perform topological sort using Kahn's Algorithm
void topologicalSortKahn(struct Graph* graph) {
int* queue = (int*)malloc(graph->numVertices * sizeof(int));
int front = 0, rear = 0;
int* result = (int*)malloc(graph->numVertices * sizeof(int));
int result_idx = 0;
int visited_count = 0;
// Step 1: Initialize queue with all vertices having in-degree 0
for (int i = 0; i < graph->numVertices; i++) {
if (graph->inDegree[i] == 0) {
queue[rear++] = i;
}
}
// Step 2: Process queue
while (front != rear) {
int u = queue[front++]; // Dequeue a vertex
result[result_idx++] = u;
visited_count++;
struct AdjNode* temp = graph->adjLists[u];
while (temp != NULL) {
int v = temp->dest;
graph->inDegree[v]--; // Decrement in-degree of neighbor
// If in-degree becomes 0, enqueue it
if (graph->inDegree[v] == 0) {
queue[rear++] = v;
}
temp = temp->next;
}
}
// Step 3: Check for cycle
if (visited_count != graph->numVertices) {
printf("Graph contains a cycle. Topological Sort not possible.\\n");
} else {
printf("Topological Sort (Kahn's Algorithm): ");
for (int i = 0; i < graph->numVertices; i++) {
printf("%d ", result[i]);
}
printf("\\n");
}
free(queue);
free(result);
}
// Function to free graph memory
void freeGraph(struct Graph* graph) {
for (int i = 0; i < graph->numVertices; i++) {
struct AdjNode* temp = graph->adjLists[i];
while (temp != NULL) {
struct AdjNode* toFree = temp;
temp = temp->next;
free(toFree);
}
}
free(graph->adjLists);
free(graph->inDegree);
free(graph);
}
int main() {
// Example 1: Graph with 6 vertices and 6 edges
// Dependencies: A->B, A->C, B->D, C->D, E->F (using 0-indexed: 0->1, 0->2, 1->3, 2->3, 4->5)
int numVertices1 = 6;
struct Graph* graph1 = createGraph(numVertices1);
addEdge(graph1, 0, 1); // A -> B
addEdge(graph1, 0, 2); // A -> C
addEdge(graph1, 1, 3); // B -> D
addEdge(graph1, 2, 3); // C -> D
addEdge(graph1, 4, 5); // E -> F
topologicalSortKahn(graph1); // Expected: 0 4 1 2 5 3 or 4 0 1 2 5 3 etc.
freeGraph(graph1);
printf("\\n");
// Example 2: Graph with a cycle (0->1, 1->2, 2->0)
int numVertices2 = 3;
struct Graph* graph2 = createGraph(numVertices2);
addEdge(graph2, 0, 1);
addEdge(graph2, 1, 2);
addEdge(graph2, 2, 0); // Creates a cycle
topologicalSortKahn(graph2); // Expected: "Graph contains a cycle..."
freeGraph(graph2);
return 0;
}
Sample Output
Topological Sort (Kahn's Algorithm): 0 4 1 2 5 3
Graph contains a cycle. Topological Sort not possible.
Stepwise Explanation (Kahn's Algorithm)
- Graph Creation: A graph is initialized with a specified number of vertices, an array for adjacency lists, and an array
inDegreeto track incoming edges. - Edge Addition: When an edge
src -> destis added,destis appended tosrc's adjacency list, andinDegree[dest]is incremented. - Queue Initialization: The
topologicalSortKahnfunction starts by iterating through all vertices. Any vertex with aninDegreeof 0 is added to a queue, as these are the initial tasks with no dependencies. - Main Loop: The algorithm then enters a loop that continues as long as the queue is not empty.
- A vertex
uis dequeued and added to theresultlist. - For each neighbor
vofu,inDegree[v]is decremented. - If
inDegree[v]becomes 0, it means all prerequisites forvhave been met, sovis enqueued.
- Cycle Check: After the queue is empty, if the
visited_count(number of vertices added to the result) is less than the total number of vertices, it indicates that some vertices could not be processed due to a cycle.
2. DFS-based Algorithm
The DFS-based algorithm leverages the properties of Depth-First Search to find the topological order.
- Summary: Performs a DFS traversal; vertices are added to a stack *after* all their dependent vertices have been visited.
Detailed Explanation
- Visited States: Maintain a
visitedarray to track the state of each vertex during DFS:
-
0(WHITE): Unvisited -
1(GRAY): Currently visiting (in the recursion stack) -
2(BLACK): Finished visiting (all its descendants visited)
- DFS Traversal: For each unvisited vertex:
- Perform a DFS starting from that vertex.
- Mark the current vertex
uasGRAY(visiting). - For each neighbor
vofu: - If
visGRAY, a cycle is detected. - If
visWHITE, recursively call DFS onv. If the recursive call detects a cycle, propagate it up. - After visiting all neighbors and their descendants, mark
uasBLACK(finished). - Push
uonto a stack.
- Result: The topological order is obtained by popping elements from the stack.
C Code Example (DFS-based Algorithm)
// Topological Sort using DFS-based Algorithm
#include <stdio.h>
#include <stdlib.h> // For malloc, free
#include <stdbool.h> // For bool
#define MAX_NODES 100
// Structure for an adjacency list node
struct AdjNode {
int dest;
struct AdjNode* next;
};
// Structure for the graph
struct GraphDFS {
int numVertices;
struct AdjNode** adjLists; // Array of linked lists
int* visited; // 0: unvisited, 1: visiting (in recursion stack), 2: visited
};
// Structure for a stack
struct Stack {
int* arr;
int top;
int capacity;
};
// Function to create an adjacency list node
struct AdjNode* createAdjNodeDFS(int dest) {
struct AdjNode* newNode = (struct AdjNode*)malloc(sizeof(struct AdjNode));
newNode->dest = dest;
newNode->next = NULL;
return newNode;
}
// Function to create a graph of 'vertices' number of nodes
struct GraphDFS* createGraphDFS(int vertices) {
struct GraphDFS* graph = (struct GraphDFS*)malloc(sizeof(struct GraphDFS));
graph->numVertices = vertices;
graph->adjLists = (struct AdjNode**)malloc(vertices * sizeof(struct AdjNode*));
graph->visited = (int*)calloc(vertices, sizeof(int)); // Initialize all to 0 (unvisited)
for (int i = 0; i < vertices; i++) {
graph->adjLists[i] = NULL;
}
return graph;
}
// Function to add an edge from src to dest
void addEdgeDFS(struct GraphDFS* graph, int src, int dest) {
struct AdjNode* newNode = createAdjNodeDFS(dest);
newNode->next = graph->adjLists[src];
graph->adjLists[src] = newNode;
}
// Stack functions
struct Stack* createStack(int capacity) {
struct Stack* stack = (struct Stack*)malloc(sizeof(struct Stack));
stack->capacity = capacity;
stack->top = -1;
stack->arr = (int*)malloc(capacity * sizeof(int));
return stack;
}
bool isEmpty(struct Stack* stack) {
return stack->top == -1;
}
void push(struct Stack* stack, int item) {
if (stack->top == stack->capacity - 1) {
// Stack overflow, for simplicity in tutorial, we assume enough capacity
return;
}
stack->arr[++stack->top] = item;
}
int pop(struct Stack* stack) {
if (isEmpty(stack)) {
// Stack underflow
return -1; // Or handle error appropriately
}
return stack->arr[stack->top--];
}
// Recursive DFS function for topological sort
bool dfsTopologicalSort(struct GraphDFS* graph, int u, struct Stack* stack, bool* hasCycle) {
graph->visited[u] = 1; // Mark as visiting (GRAY)
struct AdjNode* temp = graph->adjLists[u];
while (temp != NULL) {
int v = temp->dest;
if (graph->visited[v] == 1) { // If neighbor is currently visiting, it's a back-edge (cycle)
*hasCycle = true;
return false; // Cycle detected
}
if (graph->visited[v] == 0) { // If neighbor is unvisited (WHITE)
if (!dfsTopologicalSort(graph, v, stack, hasCycle)) {
return false; // Propagate cycle detection
}
}
temp = temp->next;
}
graph->visited[u] = 2; // Mark as visited (BLACK) - finished processing this node and its descendants
push(stack, u); // Push to stack after all descendants are processed
return true;
}
// Function to perform topological sort using DFS-based algorithm
void topologicalSortDFS(struct GraphDFS* graph) {
struct Stack* stack = createStack(graph->numVertices);
bool hasCycle = false;
for (int i = 0; i < graph->numVertices; i++) {
if (graph->visited[i] == 0) { // If unvisited
if (!dfsTopologicalSort(graph, i, stack, &hasCycle)) {
break; // Cycle detected, no need to continue
}
}
}
if (hasCycle) {
printf("Graph contains a cycle. Topological Sort not possible.\\n");
} else {
printf("Topological Sort (DFS-based Algorithm): ");
while (!isEmpty(stack)) {
printf("%d ", pop(stack));
}
printf("\\n");
}
free(stack->arr);
free(stack);
}
// Function to free graph memory
void freeGraphDFS(struct GraphDFS* graph) {
for (int i = 0; i < graph->numVertices; i++) {
struct AdjNode* temp = graph->adjLists[i];
while (temp != NULL) {
struct AdjNode* toFree = temp;
temp = temp->next;
free(toFree);
}
}
free(graph->adjLists);
free(graph->visited);
free(graph);
}
int main() {
// Example 1: Graph with 6 vertices and 6 edges
// Dependencies: A->B, A->C, B->D, C->D, E->F (using 0-indexed: 0->1, 0->2, 1->3, 2->3, 4->5)
int numVertices1 = 6;
struct GraphDFS* graph1 = createGraphDFS(numVertices1);
addEdgeDFS(graph1, 0, 1); // A -> B
addEdgeDFS(graph1, 0, 2); // A -> C
addEdgeDFS(graph1, 1, 3); // B -> D
addEdgeDFS(graph1, 2, 3); // C -> D
addEdgeDFS(graph1, 4, 5); // E -> F
topologicalSortDFS(graph1); // Expected: 4 0 2 1 3 5 or 4 0 1 2 3 5 etc.
freeGraphDFS(graph1);
printf("\\n");
// Example 2: Graph with a cycle (0->1, 1->2, 2->0)
int numVertices2 = 3;
struct GraphDFS* graph2 = createGraphDFS(numVertices2);
addEdgeDFS(graph2, 0, 1);
addEdgeDFS(graph2, 1, 2);
addEdgeDFS(graph2, 2, 0); // Creates a cycle
topologicalSortDFS(graph2); // Expected: "Graph contains a cycle..."
freeGraphDFS(graph2);
return 0;
}
Sample Output
Topological Sort (DFS-based Algorithm): 4 0 1 2 3 5
Graph contains a cycle. Topological Sort not possible.
Stepwise Explanation (DFS-based Algorithm)
- Graph and Visited Array Initialization: A graph is created, and a
visitedarray (initialized to 0 for all nodes, meaningunvisited) is used to track the state of nodes during DFS. - Stack Creation: A stack is initialized to store the topological order.
- Iterate Vertices: The
topologicalSortDFSfunction iterates through all vertices. If a vertexiisunvisited(state 0), a DFS traversaldfsTopologicalSortis initiated fromi. - DFS Function (
dfsTopologicalSort):
- Mark
visiting: The current vertexuis marked asvisiting(state 1). - Explore Neighbors: It then iterates through all neighbors
vofu. - If
vis alreadyvisiting(state 1), a back-edge is found, indicating a cycle. ThehasCycleflag is set, and the function returnsfalse. - If
visunvisited(state 0), a recursive DFS call is made forv. If this recursive call finds a cycle, it propagates thefalsereturn. - Mark
visitedand Push to Stack: Once all neighbors and their descendants have been processed (or a cycle has been detected and propagated),uis marked asvisited(state 2), and then pushed onto the stack. This ensures that a vertex is pushed only after all its dependent tasks have been added.
- Result Retrieval: After the main loop finishes, if no cycle was detected, the elements are popped from the stack. The sequence of popped elements gives the topological order.
Conclusion
Topological sort is a fundamental algorithm for directed acyclic graphs, essential for ordering tasks with dependencies. Kahn's algorithm offers a BFS-based approach using in-degrees, making it intuitive for visualizing the "no prerequisites" concept. The DFS-based algorithm leverages recursion and a stack to build the order by processing nodes only after all their descendants. Both methods effectively solve the problem and provide robust cycle detection.
Summary
- Topological sort orders vertices in a DAG such that dependencies are respected.
- It's used for scheduling, dependency resolution, and build systems.
- Graphs must be acyclic; cycles prevent a topological sort.
- Kahn's Algorithm (BFS-based):
- Uses in-degrees of vertices.
- Processes nodes with 0 in-degree first, then updates neighbors' in-degrees.
- Uses a queue to manage nodes ready for processing.
- DFS-based Algorithm:
- Uses recursion and a stack.
- A node is pushed onto the stack only after all its adjacent nodes (and their descendants) have been visited.
- The final order is obtained by popping from the stack.
- Both algorithms include mechanisms for detecting cycles in the graph.