Depth-first Search (DFS)
Explore how depth-first search (DFS) is used to traverse graphs by visiting the deepest nodes first. Understand node and edge classifications during DFS, and learn how to apply DFS to check graph connectivity and detect cycles. This lesson helps you master fundamental traversal techniques essential for graph algorithms in C++.
We'll cover the following...
After settling the basic terminology and learning about graph representations, let’s dive into our first actual graph algorithm. In this section, we’ll look into different ways of traversing graphs.
Unlike totally or partially ordered data structures, such as lists or trees, there is no canonical order in which the nodes of a graph should be explored. However, there are two standard algorithms to traverse graphs that define an order in which the nodes are visited: depth-first search (DFS) and breadth-first search (BFS). Although the algorithms are elementary, they can already be applied to solve a wide variety of graph problems with just basic modifications.
In this lesson, we will start by looking into depth-first search.
Explanation of DFS
Depth-first search is a method to traverse a graph where the deepest not yet explored node is always explored first.
To illustrate the concept let’s consider the following example graph:
The node was not discovered during our search, because it is unreachable from our starting node . Depending on our application, we might want to continue running another DFS starting from node to explore the remaining nodes and edges.
To recap, during DFS we can classify nodes into three states:
- Undiscovered.
- In progress, has unchecked outgoing edges.
- Finished.
We also classify edges as either:
- Discovery edges, which leads to previously undiscovered nodes.
- Back edges, which leads from in-progress nodes to other in-progress nodes.
- Redundant edges, which leads from in-progress nodes to finished nodes.
In literature, redundant edges are sometimes further classified into forward edges and cross edges, but we do not need this distinction for this course.
When running DFS on an undirected graph, there can be no redundant edges.
All edges are either discovery edges or back edges.
DFS results are not unique
Note that in general, the starting node will determine which nodes will be discovered during the DFS. Also, the order in which the neighbors of a node are visited can determine whether an edge is labeled as a discovery edge, back edge, or redundant edge. For example, here is the result of running DFS from starting node instead:
As we can see, all nodes are marked as finished (blue) when starting from , and there are no redundant edges. However, we still have back edges. If one DFS execution that visits all nodes finds back edges, then every such DFS execution will find back edges.
Simple applications
The basic DFS algorithms can already be used to solve some elementary (but important!) graph problems.
Connectivity check
We can use DFS to check whether an undirected graph is connected. Hence, we can also check whether a directed graph is weakly connected. To do so, we run DFS from an arbitrary starting vertex. Once the vertex is marked as finished (blue), we check if there are any white vertices left. If so, then the graph is definitely not connected. Otherwise, it is at least weakly connected.
We can also apply DFS to find out whether a graph is strongly connected, but the algorithm is a bit more complex. We’ll discuss this algorithm in its own lesson.
Cycle detection
We can use DFS to detect cycles in graphs. For this, we’ll start with DFS from an arbitrary vertex. After finishing that vertex, we’ll continue with DFS from a remaining white vertex, until all vertices are finished (blue).
If we find a back edge during the execution of DFS that leads from the current node to another node that is still in progress (gray), we’ve found a cycle in the graph. This indicates that if a node is still in progress, there must be a path from it to the current node, and the back edge thus completes a cycle. If such a back edge is never discovered, the graph is acyclic.