Python Topological Sorting
## Understanding Topological Sorting in Python
**Topological Sorting** for a **Directed Acyclic Graph (DAG)** is a linear ordering of its vertices such that for every directed edge $u \rightarrow v$, vertex $u$ comes before $v$ in the ordering.
In simpler terms, topological sorting transforms a partial order (where only some elements have a defined sequence relative to each other) into a total order (where all elements are arranged in a complete, linear sequence).
### Key Characteristics of a Topological Sort
For a sequence of vertices to be a valid topological sort of a graph, it must satisfy the following conditions:
* **Completeness:** Every vertex in the graph must appear in the sequence exactly once.
* **Precedence:** If there is a path from vertex $A$ to vertex $B$ in the graph, then $A$ must appear before $B$ in the sequence.
> **Note:** A topological ordering exists if and only if the graph is a **Directed Acyclic Graph (DAG)**. If the graph contains a cycle, no ordering can exist, because every vertex on the cycle would have to appear before itself.
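The two conditions above can be checked mechanically. The following is a minimal sketch, assuming vertices are labelled `0` to `num_vertices - 1` and edges are given as `(u, v)` pairs; the function name `is_topological_order` is a hypothetical helper, not part of any library. Checking each edge is enough to cover every path, because "appears before" is transitive.

```python
def is_topological_order(num_vertices, edges, order):
    """Return True if `order` is a valid topological order of the graph."""
    # Completeness: every vertex 0 .. num_vertices - 1 appears exactly once.
    if sorted(order) != list(range(num_vertices)):
        return False
    # Precedence: for every edge u -> v, u must come before v.
    position = {vertex: index for index, vertex in enumerate(order)}
    return all(position[u] < position[v] for u, v in edges)


# A small example DAG with 6 vertices (illustrative data).
edges = [(5, 2), (5, 0), (4, 0), (4, 1), (2, 3), (3, 1)]
print(is_topological_order(6, edges, [5, 4, 2, 3, 1, 0]))  # True
print(is_topological_order(6, edges, [4, 5, 0, 2, 3, 1]))  # True
print(is_topological_order(6, edges, [0, 1, 2, 3, 4, 5]))  # False
```

The first two calls show that a graph can have more than one valid topological order.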
---
## How Topological Sorting Works
There are two primary algorithms used to find a topological sort:
1. **Depth-First Search (DFS) Based Algorithm:** Uses recursion and a stack to order vertices.
2. **Kahn's Algorithm (BFS-Based):** Uses in-degrees (the number of incoming edges to a vertex) and a queue.
Below, we will focus on the **DFS-based approach**, which utilizes a helper stack to store the ordered vertices.
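For comparison, Kahn's algorithm can be sketched as follows. This is an illustrative version rather than a reference implementation: it assumes vertices are labelled `0` to `num_vertices - 1`, and the function name `kahn_topological_sort` and the choice to raise `ValueError` on a cycle are assumptions made for this example.

```python
from collections import deque


def kahn_topological_sort(num_vertices, edges):
    """Return a topological order, or raise ValueError if a cycle exists."""
    adjacency = [[] for _ in range(num_vertices)]
    in_degree = [0] * num_vertices
    for u, v in edges:
        adjacency[u].append(v)
        in_degree[v] += 1

    # Start with every vertex that has no incoming edges.
    queue = deque(v for v in range(num_vertices) if in_degree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        # "Remove" u from the graph by lowering its neighbours' in-degrees.
        for v in adjacency[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)

    # Vertices on a cycle never reach in-degree 0, so they are never emitted.
    if len(order) != num_vertices:
        raise ValueError("graph contains a cycle; no topological order exists")
    return order


edges = [(5, 2), (5, 0), (4, 0), (4, 1), (2, 3), (3, 1)]
print(kahn_topological_sort(6, edges))  # [4, 5, 2, 0, 3, 1]
```

A side benefit of this approach is that cycle detection comes for free: if fewer than `num_vertices` vertices are emitted, the graph is not a DAG.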
### The DFS-Based Algorithm Steps:
1. Initialize a boolean array `visited` to keep track of visited vertices, and an empty list/stack to store the sorted order.
2. For each unvisited vertex in the graph, call a recursive helper function (`topologicalSortUtil`).
3. In the helper function:
    * Mark the current vertex as visited.
    * Recursively call the helper function for each adjacent vertex that has not been visited yet.
    * Once all adjacent vertices have been processed, insert the current vertex at the front of the list (equivalently, push it onto a stack that is read from top to bottom at the end).
4. Read from front to back, the final list is a topological sort of the graph.
---
## Python Implementation
Here is a complete Python implementation of Topological Sorting using the DFS-based approach:
```python
from collections import defaultdict


class Graph:
    def __init__(self, vertices):
        # Dictionary to store the adjacency list
        self.graph = defaultdict(list)
        # Number of vertices in the graph (labelled 0 .. vertices - 1)
        self.V = vertices

    def addEdge(self, u, v):
        """Add a directed edge from vertex u to vertex v."""
        self.graph[u].append(v)

    def topologicalSortUtil(self, v, visited, stack):
        """A recursive helper function used by topologicalSort."""
        # Mark the current node as visited
        visited[v] = True
        # Recur for all the unvisited vertices adjacent to this vertex
        for i in self.graph[v]:
            if not visited[i]:
                self.topologicalSortUtil(i, visited, stack)
        # All descendants are placed; insert current vertex in front of them
        stack.insert(0, v)

    def topologicalSort(self):
        """Perform Topological Sort on the graph and print the result."""
        # Mark all the vertices as not visited
        visited = [False] * self.V
        stack = []
        # Call the recursive helper function to store Topological Sort
        # starting from all vertices one by one
        for i in range(self.V):
            if not visited[i]:
                self.topologicalSortUtil(i, visited, stack)
        print(stack)


# Create a graph with 6 vertices
g = Graph(6)
g.addEdge(5, 2)
g.addEdge(5, 0)
g.addEdge(4, 0)
g.addEdge(4, 1)
g.addEdge(2, 3)
g.addEdge(3, 1)
print("Topological Sort Result:")
g.topologicalSort()
```
### Output
When you run the code above, it produces the following output:
```text
Topological Sort Result:
[5, 4, 2, 3, 1, 0]
```
---
## Complexity Analysis
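* **Time Complexity:** $\mathcal{O}(V + E)$ for the DFS itself, where $V$ is the number of vertices and $E$ is the number of edges. The traversal visits every vertex once and examines every edge once. One caveat for the implementation above: `stack.insert(0, v)` on a Python list costs $\mathcal{O}(V)$ per call, so the code as written is $\mathcal{O}(V^2 + E)$ in the worst case. Appending each vertex and reversing the list once at the end keeps the total at $\mathcal{O}(V + E)$.
* **Space Complexity:** $\mathcal{O}(V)$ auxiliary space for the recursion stack, the `visited` array, and the output `stack`, in addition to the $\mathcal{O}(V + E)$ adjacency list that stores the graph.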
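The following is a minimal sketch of a DFS variant that appends each finished vertex and reverses the list once at the end, so that no single step costs more than amortized constant time. It also adds three-state marking to detect cycles, which the implementation above does not do. The function name `topological_sort_linear` and the use of `ValueError` are illustrative choices, not part of the original code.

```python
def topological_sort_linear(num_vertices, edges):
    """DFS-based topological sort using append + reverse, with cycle detection."""
    adjacency = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].append(v)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    state = [WHITE] * num_vertices
    finished = []  # vertices in the order their DFS calls complete

    def visit(u):
        state[u] = GRAY
        for v in adjacency[u]:
            if state[v] == GRAY:
                # v is an ancestor of u on the current path: back edge.
                raise ValueError("graph contains a cycle")
            if state[v] == WHITE:
                visit(v)
        state[u] = BLACK
        finished.append(u)  # amortized O(1), unlike insert(0, u)

    for u in range(num_vertices):
        if state[u] == WHITE:
            visit(u)

    finished.reverse()  # reverse finishing order is a topological order
    return finished


edges = [(5, 2), (5, 0), (4, 0), (4, 1), (2, 3), (3, 1)]
print(topological_sort_linear(6, edges))  # [5, 4, 2, 3, 1, 0]
```

Like the recursive implementation above, this sketch is limited by Python's recursion depth on very deep graphs.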
---
## Real-World Applications
Topological sorting is widely used in scenarios where tasks have dependencies on other tasks. Common use cases include:
* **Build Systems:** Resolving file dependencies (e.g., compiling source files in a specific order, like in `Make` or `Gradle`).
* **Package Managers:** Resolving software package dependencies (e.g., `pip` or `npm` installing packages in the correct order).
* **Task Scheduling:** Scheduling a sequence of jobs or tasks where certain tasks must be completed before others can begin.
* **Instruction Scheduling:** Optimizing the execution order of instructions in compilers.
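For dependency problems like these, Python 3.9 and later ship a ready-made sorter in the standard library's `graphlib` module. The sketch below assumes a hypothetical set of build steps; the step names are made up for illustration. Note that `TopologicalSorter` expects each key to map to the set of nodes it depends on (its predecessors), which is the reverse of the adjacency list used earlier.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical build steps: each key maps to the steps it depends on.
dependencies = {
    "fetch_sources": set(),
    "compile": {"fetch_sources"},
    "test": {"compile"},
    "package": {"compile", "test"},
}

build_order = list(TopologicalSorter(dependencies).static_order())
print(build_order)  # ['fetch_sources', 'compile', 'test', 'package']

# A circular dependency is reported with CycleError.
try:
    list(TopologicalSorter({"a": {"b"}, "b": {"a"}}).static_order())
    cycle_found = False
except CycleError:
    cycle_found = True
print(cycle_found)  # True
```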