C++ Program To Implement Topological Sort Algorithm
Topological Sort is an algorithm used to find a linear ordering of vertices in a directed acyclic graph (DAG) such that for every directed edge (u, v), vertex u comes before vertex v in the ordering. This ordering is crucial for tasks where dependencies exist, ensuring that prerequisites are met before subsequent tasks begin. In this article, you will learn how to implement the Topological Sort algorithm in C++ using both Depth-First Search (DFS) and Kahn's algorithm.
Problem Statement
The core problem is to arrange the vertices of a Directed Acyclic Graph (DAG) into a sequence where for every directed edge from vertex A to vertex B, A appears before B in the sequence. If a graph contains a cycle, a topological sort is not possible, as there would be no way to satisfy the "comes before" condition for all edges within the cycle. This problem frequently arises in scenarios requiring dependency resolution.
Example
Consider a graph representing course prerequisites:
-
Math->Physics -
Math->Chemistry -
Physics->Advanced Physics -
Chemistry->Biochemistry -
Data Structures->Algorithms
A possible topological sort for these courses would be:
Math, Data Structures, Physics, Chemistry, Advanced Physics, Biochemistry, Algorithms
Note that Physics and Chemistry can appear in any order relative to each other as long as Math comes before both, and Advanced Physics comes after Physics, etc. Data Structures and Math are independent starting points.
Background & Knowledge Prerequisites
To effectively understand and implement topological sort, it's beneficial to have a grasp of:
- Directed Acyclic Graphs (DAGs): Understanding what a directed graph is and the concept of a graph without cycles.
- Graph Representation: Familiarity with representing graphs, typically using an adjacency list. An adjacency list uses an array or vector where each index
istores a list of vertices adjacent to vertexi. - Depth-First Search (DFS): Knowledge of how DFS traverses a graph, exploring as far as possible along each branch before backtracking.
- Breadth-First Search (BFS): Understanding how BFS explores a graph level by level, useful for Kahn's algorithm.
- Basic C++ Data Structures: Proficiency with
std::vector,std::stack, andstd::queue.
Use Cases or Case Studies
Topological sort is a fundamental algorithm with various practical applications across different domains:
- Task Scheduling: In project management or build systems (like Makefiles), tasks often have dependencies. Topological sort helps in determining a valid order to execute tasks, ensuring all prerequisites are met.
- Course Prerequisite Systems: As seen in the example, universities use topological sort implicitly to determine the order in which courses must be taken.
- Dependency Resolution in Package Managers: Software package managers (e.g.,
apt,npm,pip) use topological sort to install packages and their dependencies in the correct order. - Data Serialization: In some systems, objects might depend on other objects. Topological sort can determine a valid order for saving or loading these objects to maintain referential integrity.
- Instruction Scheduling in Compilers: Compilers can use topological sort to reorder instructions in a program to optimize performance, respecting data dependencies.
Solution Approaches
There are two primary algorithms for implementing topological sort: one based on Depth-First Search (DFS) and another known as Kahn's Algorithm, which uses a BFS-like approach.
Approach 1: Topological Sort using Depth-First Search (DFS)
This approach leverages the properties of DFS. When a DFS finishes exploring all neighbors of a node, and then the node itself, it means that node can be placed in the topological order *before* any of its dependents. By pushing nodes onto a stack *after* visiting all their adjacent nodes, we naturally get a reverse topological order. Popping elements from this stack then gives the correct order.
- One-line summary: Recursively explore all neighbors of a node, and only push the node onto a result stack after all its dependencies have been processed.
// DFS-based Topological Sort
#include <iostream>
#include <vector>
#include <stack>
#include <list> // For adjacency list
using namespace std;
// Graph class to represent a directed graph
class Graph {
int V; // Number of vertices
list<int>* adj; // Pointer to an array containing adjacency lists
// A recursive function used by topologicalSort
void topologicalSortUtil(int v, vector<bool>& visited, stack<int>& Stack) {
// Mark the current node as visited
visited[v] = true;
// Recur for all the vertices adjacent to this vertex
for (int neighbor : adj[v]) {
if (!visited[neighbor]) {
topologicalSortUtil(neighbor, visited, Stack);
}
}
// Push current vertex to stack which stores result
Stack.push(v);
}
public:
Graph(int V) {
this->V = V;
adj = new list<int>[V];
}
// Function to add an edge to graph
void addEdge(int u, int v) {
adj[u].push_back(v);
}
// Prints a topological sort of the complete graph
void topologicalSort() {
stack<int> Stack;
vector<bool> visited(V, false); // Mark all the vertices as not visited
// Call the recursive helper function for all vertices
// to get topological sort
for (int i = 0; i < V; i++) {
if (!visited[i]) {
topologicalSortUtil(i, visited, Stack);
}
}
// Print contents of stack
cout << "Topological Sort (DFS-based): ";
while (!Stack.empty()) {
cout << Stack.top() << " ";
Stack.pop();
}
cout << endl;
}
};
int main() {
// Step 1: Create a graph
Graph g(6); // 6 vertices (0 to 5)
// Step 2: Add edges representing dependencies
g.addEdge(5, 2);
g.addEdge(5, 0);
g.addEdge(4, 0);
g.addEdge(4, 1);
g.addEdge(2, 3);
g.addEdge(3, 1);
// Step 3: Perform topological sort
g.topologicalSort();
return 0;
}
- Sample Output:
Topological Sort (DFS-based): 4 5 2 3 0 1
4 5 can be 5 4, 0 1 can vary depending on which unvisited node is picked first in the for loop, as long as dependencies are met.)
- Stepwise explanation:
- Initialization: A stack
Stackis created to store the topological order. Avisitedboolean array tracks visited nodes, initialized tofalsefor all vertices. - Iterate Vertices: The
topologicalSortfunction iterates through all vertices (from 0 to V-1). If a vertexihas not been visited, it calls a recursive helper functiontopologicalSortUtilfori. This ensures all connected components are covered. - DFS Helper (
topologicalSortUtil):
- Mark the current vertex
vasvisited. - For each neighbor
neighborofv: ifneighboris notvisited, recursively calltopologicalSortUtilonneighbor. - After all neighbors of
v(and their descendants) have been visited and processed, pushvontoStack. This is crucial: a node is pushed only after all its dependent nodes have been visited.
- Print Result: After the main loop finishes, the
Stackcontains the vertices in reverse topological order. Pop elements from the stack and print them to get the correct topological order.
Approach 2: Topological Sort using Kahn's Algorithm (BFS-based)
Kahn's algorithm takes a different, iterative approach. It identifies vertices with an in-degree of 0 (no incoming edges), as these are the ones that can be processed first. Once a vertex is processed, it's "removed" from the graph conceptually by decrementing the in-degrees of its neighbors. New vertices with an in-degree of 0 are then added to a queue, and the process continues.
- One-line summary: Find all nodes with no incoming edges, add them to a queue, and iteratively process them, decrementing neighbors' in-degrees and adding new in-degree-zero nodes to the queue.
// Kahn's Algorithm (BFS-based) Topological Sort
#include <iostream>
#include <vector>
#include <queue>
#include <list> // For adjacency list
using namespace std;
// Graph class to represent a directed graph
class Graph {
int V; // Number of vertices
list<int>* adj; // Pointer to an array containing adjacency lists
public:
Graph(int V) {
this->V = V;
adj = new list<int>[V];
}
// Function to add an edge to graph
void addEdge(int u, int v) {
adj[u].push_back(v);
}
// Prints a topological sort of the complete graph using Kahn's algorithm
void topologicalSortKahn() {
vector<int> in_degree(V, 0); // Stores in-degree of all vertices
// Calculate in-degrees for all vertices
for (int u = 0; u < V; u++) {
for (int v : adj[u]) {
in_degree[v]++;
}
}
queue<int> q; // Queue to store vertices with in-degree 0
// Enqueue all vertices with in-degree 0
for (int i = 0; i < V; i++) {
if (in_degree[i] == 0) {
q.push(i);
}
}
vector<int> top_order; // Vector to store the topological order
int count_visited_nodes = 0;
// One by one dequeue vertices from queue and add them to top_order
// Then, for all its adjacent vertices, decrement their in-degree
// If an adjacent vertex's in-degree becomes 0, enqueue it
while (!q.empty()) {
int u = q.front();
q.pop();
top_order.push_back(u);
count_visited_nodes++;
// Iterate through all neighbors of u
for (int v : adj[u]) {
in_degree[v]--; // Decrement in-degree of v
if (in_degree[v] == 0) {
q.push(v); // If in-degree becomes 0, add to queue
}
}
}
// Check for cycle: if topological order doesn't include all nodes
if (count_visited_nodes != V) {
cout << "Graph has a cycle, topological sort not possible!" << endl;
return;
}
// Print the topological order
cout << "Topological Sort (Kahn's Algorithm): ";
for (int node : top_order) {
cout << node << " ";
}
cout << endl;
}
};
int main() {
// Step 1: Create a graph
Graph g(6); // 6 vertices (0 to 5)
// Step 2: Add edges representing dependencies
g.addEdge(5, 2);
g.addEdge(5, 0);
g.addEdge(4, 0);
g.addEdge(4, 1);
g.addEdge(2, 3);
g.addEdge(3, 1);
// Step 3: Perform topological sort using Kahn's algorithm
g.topologicalSortKahn();
return 0;
}
- Sample Output:
Topological Sort (Kahn's Algorithm): 4 5 0 2 3 1
- Stepwise explanation:
- Calculate In-degrees: An array
in_degreeis initialized for all vertices, storing the count of incoming edges for each. This is computed by iterating through all edges. - Initialize Queue: A queue
qis created, and all vertices with anin_degreeof 0 are added to it. These are the starting points for the topological sort. - Process Queue:
- While the queue is not empty:
- Dequeue a vertex
u. Addutotop_order(the result list). Incrementcount_visited_nodes. - For each neighbor
vofu: decrementin_degree[v]. This simulates removing the edge (u, v) anduitself. - If
in_degree[v]becomes 0, enqueuev. These new nodes now have no pending dependencies and can be processed.
- Cycle Detection: After the queue is empty, if
count_visited_nodesis not equal to the total number of verticesV, it means some vertices were never added to the queue, indicating the presence of a cycle in the graph. - Print Result: Print the
top_ordervector.
Conclusion
Topological sort is a powerful algorithm for linearizing dependencies in directed acyclic graphs. Both the DFS-based approach and Kahn's algorithm (BFS-based) effectively solve this problem, with the DFS method leveraging recursion and a stack, and Kahn's algorithm using iteration and a queue. While both have a time complexity of O(V + E) for V vertices and E edges, Kahn's algorithm offers a straightforward way to detect cycles.
Summary
- Purpose: Orders vertices in a Directed Acyclic Graph (DAG) such that for every edge (u, v), u appears before v.
- Applicability: Only works on DAGs; a cycle indicates no topological sort is possible.
- DFS-based Approach:
- Uses a recursive DFS traversal.
- Nodes are pushed onto a stack *after* all their dependents have been processed.
- Popping from the stack yields the topological order.
- Kahn's Algorithm (BFS-based):
- Iterative, using a queue.
- Starts with nodes having an in-degree of 0.
- Decrement neighbors' in-degrees as nodes are processed; add new in-degree-zero nodes to the queue.
- Can detect cycles if not all nodes are included in the final order.
- Common Use Cases: Task scheduling, course prerequisites, dependency resolution in software.