# Algorithms Third Edition in C++ Part 5. Graph Algorithms (2006)

### CHAPTER NINETEEN

Digraphs and DAGs

### 19.7 Reachability in DAGs

To conclude our study of DAGs, we consider the problem of computing the transitive closure of a DAG. Can we develop algorithms for DAGs that are more efficient than the algorithms for general digraphs that we considered in *Section 19.3*?

Any method for topological sorting can serve as the basis for a transitive-closure algorithm for DAGs, as follows: We proceed through the vertices in reverse topological order, computing the reachability vector for each vertex (its row in the transitive-closure matrix) from the rows corresponding to its adjacent vertices. The reverse topological sort ensures that all those rows have already been computed. In total, we check each of the *V* entries in the vector corresponding to the destination vertex of each of the *E* edges, for a total running time proportional to *V E*. Although it is simple to implement, this method is no more efficient for DAGs than for general digraphs.

When we use a standard DFS for the topological sort (see *Program 19.7*), we can improve performance for some DAGs, as shown in *Program 19.9*. Since there are no cycles in a DAG, there are no back edges in any DFS. More important, both cross edges and down edges point to nodes for which the DFS has completed. To take advantage of this fact, we develop a recursive function to compute all vertices reachable from a given start vertex, but (as usual in DFS) we make no recursive calls for vertices for which the reachable set has already been computed. In this case, the reachable vertices are represented by a row in the transitive closure, and the recursive function takes the logical *or* of all the rows associated with its adjacent edges. For tree edges, we do a recursive call to compute the row; for cross edges, we can skip the recursive call because we know that the row has been computed by a previous recursive call; for down edges, we can skip the whole computation because any reachable nodes that it would add have already been accounted for in the set of reachable nodes for the destination vertex (lower and earlier in the DFS tree).

Using this version of DFS might be characterized as using dynamic programming to compute the transitive closure because we make use of results that have already been computed (and saved in the adjacency matrix rows corresponding to previously-processed vertices) to avoid making unnecessary recursive calls. *Figure 19.27* illustrates the computation of the transitive closure for the sample DAG in *Figure 19.6*.

**Property 19.13** *With dynamic programming and DFS, we can support constant query time for the abstract transitive closure of a DAG with space proportional to V*^{2} *and time proportional to V*^{2} + *VX for preprocessing (computing the transitive closure), where X is the number of cross edges in the DFS forest.*

**Figure 19.27 Transitive closure of a DAG**

This sequence of row vectors is the transitive closure of the DAG in *Figure 19.21*, with rows created in reverse topological order, computed as the last action in a recursive DFS function (see Program *19.9*). Each row is the logical or of the rows for adjacent vertices, which appear earlier in the list. For example, to compute the row for 0 we take the logical or of the rows for 5, 2, 1, and 6 (and put a 1 corresponding to 0 itself) because the edges 0-5, 0-2, 0-1, and 0-6 take us from 0 to any vertex that is reachable from any of those vertices. We can ignore down edges because they add no new information. For example, we ignore the edge from 0 to 3 because the vertices reachable from 3 are already accounted for in the row corresponding to 2.

**Program 19.9 Transitive closure of a DAG**

The constructor in this class computes the transitive closure of a DAG with a single DFS. It recursively computes the reachable vertices from each vertex from the reachable vertices of its children in the DFS tree.

*Proof*: The proof is immediate by induction from the recursive function in Program 19.9. We visit the vertices in reverse topological order. Every edge points to a vertex for which we have already computed all reachable vertices, and we can therefore compute the set of reachable vertices of any vertex by merging together the sets of reachable vertices associated with the destination vertex of each edge. Taking the logical *or* of the specified rows in the adjacency matrix accomplishes this merge. We access a row of size *V* for each tree edge and each cross edge. There are no back edges, and we can ignore down edges because we accounted for any vertices they reach when we processed any ancestors of both nodes earlier in the search.

If our DAG has no down edges (see Exercise 19.42), the running time of *Program 19.9* is proportional to *V E* and represents no improvement over the transitive-closure algorithms that we examined for general digraphs in *Section 19.3* (such as, for example, *Program 19.4*) or the approach based on topological sorting that is described at the beginning of this section. On the other hand, if the number of down edges is large (or, equivalently, the number of cross edges is small), *Program 19.9* will be significantly faster than these methods.

The problem of finding an optimal algorithm (one that is guaranteed to finish in time proportional to *V*^{2}) for computing the transitive closure of dense DAGs is still unsolved. The best known worst-case performance bound is *V E*. However, we are certainly better off using an algorithm that runs faster for a large class of DAGs, such as *Program 19.9*, than we are using one that always runs in time proportional to *V E*, such as *Program 19.4*. As we see in *Section 19.9*, this performance improvement for DAGs has direct implications for our ability to compute the transitive closure of general digraphs as well.

**Exercises**

• **19.115** Show, in the style of *Figure 19.27*, the reachability vectors that result when we use *Program 19.9* to compute the transitive closure of the DAG

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4 4-3 2-3.

• **19.116** Develop a version of *Program 19.9* that uses a representation of the transitive closure that does not support edge testing and that runs in time proportional to *V*^{2} + *Σ*^{e}*v* (*e*), where the sum is over all edges in the DAG and *v* (*e*) is the number of vertices reachable from the destination vertex of edge *e*. This cost will be significantly less than *V E* for some sparse DAGs (see Exercise 19.65).

• **19.117** Implement an abstract–transitive-closure class for DAGs that uses extra space at most proportional to *V* (and is suitable for huge DAGs). Use topological sorting to provide quick response when the vertices are not connected, and use a source-queue implementation to return the length of the path when they are connected.

• **19.118** Develop a transitive-closure implementation based on a sink-queue– based reverse topological sort (see Exercise 19.105).

• **19.119** Does your solution to Exercise 19.118 require that you examine all edges in the DAG, or are there edges that can be ignored, such as the down edges in DFS? Give an example that requires examination of all edges, or characterize the edges that can be skipped.