Representations - Minimum Spanning Trees - Algorithms Third Edition in C++ Part 5. Graph Algorithms (2006)

Algorithms Third Edition in C++ Part 5. Graph Algorithms (2006)

CHAPTER TWENTY
Minimum Spanning Trees

20.1 Representations

In this chapter, we concentrate on weighted undirected graphs—the most natural setting for MST problems. Perhaps the simplest way to begin is to extend the basic graph representations from Chapter 17 to represent weighted graphs as follows: In the adjacency-matrix representation, the matrix can contain edge weights rather than Boolean values; in the adjacency-lists representation, we can add a field for the weights to the list elements that represent edges. This classic approach is appealing in its simplicity, but we will use a different method that is not much more complicated and will make our programs useful in more general settings. Comparative performance characteristics of the two approaches are briefly treated later in this section.

To address the issues raised when we shift from graphs where our only interest is in the presence or absence of edges to graphs where we are interested in information associated with edges, it is useful to imagine scenarios where vertices and edges are objects of unknown complexity, perhaps part of a a huge client data base that is built and maintained by some other application. For example, we might like to view streets, roads, and highways in a map data base as abstract edges with their lengths as weights, but, in the data base, a road’s entry might carry a substantial amount of other information such as its name and type, a detailed description of its physical properties, how much traffic is currently using it, and so forth. Now, a client program

Program 20.1 ADT interface for graphs with weighted edges

This code defines an interface for graphs with weights and other information associated with edges. It consists of an EDGE ADT interface and a templatized GRAPH interface that may be used for any EDGE implementation. GRAPH implementations manipulate edge pointers (provided by clients with insert), not edges. The edge class also includes member functions that give information about the orientation of the edge: Either e->from(v) is true, e->v() is v and e->other(v) is e->w(); or e->from(v) is false, e->w() is v and e->other(v) is e->v().

image

Program 20.2 Example of a graph-processing client function

This function illustrates the use of our weighted-graph interface in Program 20.1. For any implementation of the interface, edges returns a vector containing pointers to all the graph’s edges. As in Chapters 17 through 19, we generally use the iterator functions only in the manner illustrated here.

image

could extract the information that it needs, build a graph, process it, and then interpret the results in the context of the data base, but that might be a difficult and costly process. In particular, it amounts to (at least) making a copy of the graph.

A better alternative is for the client to define an Edge data type and for our implementations to manipulate pointers to edges. Program 20.1 shows the details of the EDGE, GRAPH, and iterator ADTs that we use to support this approach. Clients can easily meet our minimal requirements for the EDGE data type while retaining the flexibility to define application-specific data types for use in other contexts. Our implementations can manipulate edge pointers and use the interface to extract the information they need from EDGEs without regard to how they are represented. Program 20.2 is an example of a client program that uses this interface.

For testing algorithms and basic applications, we use a class EDGE, which has two ints and a double as private data members that are initialized from constructor arguments and are the return values of the v(), w(), and wt() member functions, respectively (see Exercise 20.8). To avoid proliferation of simple types, we use edge weights of type double throughout this chapter and Chapter 21. In our examples, we use edge weights that are real numbers between 0 and 1. This decision does not conflict with various alternatives that we might need in applications, because we can explicitly or implicitly rescale the weights to fit this model (see Exercises 20.1 and 20.10). For example, if the weights are positive integers less than a known maximum value, we can divide them all by the maximum value to convert them to real numbers between 0 and 1.

Program 20.3 Weighted-graph class (adjacency matrix)

For dense weighted graphs, we use a matrix of pointers to Edges, witha pointer to the edge v-w in row v and column w. For undirected graphs, we put another pointer to the edge in row w and column v. The null pointer indicates the absence of an edge; when we remove an edge, we remove our pointer to it. This implementation does not check for parallel edges, but clients could use edge to do so.

image

Program 20.4 Iterator class for adjacency-matrix representation

This code is a straighforward adaptation of Program 17.8 to return edge pointers.

image

If we wanted to do so, we could build a more general ADT interface and use any data type for edge weights that supports addition, subtraction, and comparisons, since we do little more with the weights than to accumulate sums and make decisions based on their values. In Chapter 22, our algorithms are concerned with comparing linear combinations of edge weights, and the running time of some algorithms depends on arithmetic properties of the weights, so we switch to integer weights to allow us to more easily analyze the algorithms.

Programs 20.3 and 20.4 implement the weighted-graph ADT of Program 20.1 with an adjacency-matrix representation. As before, inserting an edge into an undirected graph amounts to storing a pointer to it in two places in the matrix—one for each orientation of the edge. As is true of algorithms that use the adjacency-matrix representation for unweighted graphs, the running time of any algorithm that uses this representation is proportional to V2 (to initialize the matrix) or higher.

With this representation, we test for the existence of an edge v-w by testing whether the pointer in row v and column w is null. In some situations, we could avoid such tests by using sentinel values for the weights, but we avoid the use of sentinel values in our implementations.

Program 20.5 gives the implementation details of the weighted-graph ADT that uses edge pointers for an adjacency-lists representation. A vertex-indexed vector associates each vertex with a linked list of that vertex’s incident edges. Each list node contains a pointer to an edge. As with the adjacency matrix, we could, if desired, save space by putting just the destination vertex and the weight in the list nodes (leaving the source vertex implicit) at the cost of a more complicated iterator (see Exercises 20.11 and 20.14).

At this point, it is worthwhile to compare these representations with the simple representations that were mentioned at the beginning of this section (see Exercises 20.11 and 20.12). If we are building a graph from scratch, using pointers certainly requires more space. Not only do we need space for the pointers, but also we need space for the indices (vertex names), which are implicit in the simple representations. To use edge pointers in the adjacency-matrix representation, we need extra space for V2 edge pointers and E pairs of indices. Similarly, to use edge pointers in the adjacency-list representation we need extra space for E edge pointers and E indices.

On the other hand, the use of edge pointers is likely to lead to faster code, because the compiled client code will follow one pointer to get direct access to an edge’s weight, in contrast to the simple representation, where an Edge needs to be constructed and its fields accessed. If the space cost is prohibitive, using the minimal representations (and perhaps streamlining the iterators to save time) certainly is a reasonable alternative; otherwise, the flexibility afforded by the pointers is probably worth the space.

Program 20.5 Weighted-graph class (adjacency lists)

This implementation of the interface in Program 20.1 is based on adjacency lists and is therefore appropriate for sparse weighted graphs. As with unweighted graphs, we represent each edge with a list node, but here each node contains a pointer to the edge that it represents, not just the destination vertex. The iterator class is a straightforward adaption of Program 17.10 (seeExercise 20.13).

image

To reduce clutter, we do use the simple representations in all of our figures. That is, rather than showing a matrix of pointers to edge structures that contain pairs of integers and weights, we simply show a matrix of weights, and rather than showing list nodes that contain pointers to edge structures, we show nodes that contain destination vertices. The adjacency-matrix and adjacency-lists representations of our sample graph are shown in Figure 20.3.

Figure 20.3 Weighted-graph representations (undirected)

image

The two standard representations of weighted undirected graphs include weights with each edge representation, as illustrated in the adjacency-matrix (left) and adjacency-lists (right) representation of the graph depicted in Figure 20.1. For simplicity in these figures, we show the weights in the matrix entries and list nodes; in our programs we use pointers to client edges. The adjacency matrix is symmetric and the adjacency lists contain two nodes for each edge, as in unweighted directed graphs. Nonexistent edges are represented null pointers in the matrix (indicated by asterisks in the figure) and are not present at all in the lists. Self-loops are absent in both of the representations illustrated here because MST algorithms are simpler without them; other algorithms that process weighted graphs use them (see Chapter 21).

As with our undirected-graph implementations, we do not explicitly test for parallel edges in either implementation. Depending upon the application, we might alter the adjacency-matrix representation to keep the parallel edge of lowest or highest weight, or to effectively coalesce parallel edges to a single edge with the sum of their weights. In the adjacency-lists representation, we allow parallel edges to remain in the data structure, but we could build more powerful data structures to eliminate them using one of the rules just mentioned for adjacency matrices (see Exercise 17.49).

How should we represent the MST itself? The MST of a graph G is a subgraph of G that is also a tree, so we have numerous options. Chief among them are

• A graph

• A linked list of edges

• A vector of pointers to edges

• A vertex-indexed vector with parent links

Figure 20.4 illustrates these options for the example MST in Figure 20.1. Another alternative is to define and use an ADT for trees.

The same tree might have many different representations in any of the schemes. In what order should the edges be presented in the list-of-edges representation? Which node should be chosen as the root in the parent-link representation (see Exercise 20.21)? Generally speaking, when we run an MST algorithm, the particular MST representation that results is an artifact of the algorithm used and does not reflect any important features of the MST.

Figure 20.4 MST representations

image

This figure depicts various representations of the MST in Figure 20.1. The most straightforward is a list of its edges, in no particular order (left). The MST is also a sparse graph, and might be represented with adjacency lists (center). The most compact is a parent-link representation: we choose one of the vertices as the root and keep two vertex-indexed vectors, one with the parent of each vertex in the tree, the other with the weight of the edge from each vertex to its parent (right). The orientation of the tree (choice of root vertex) is arbitrary, not a property of the MST. We can convert from any one of these representations to any other in linear time.

From an algorithmic point of view, the choice of MST representation is of little consequence because we can convert easily from each of these representations to any of the others. To convert from the graph representation to a vector of edges, we can use the GRAPHedges function of Program 20.2. To convert from the parent-link representation in a vector st (with weights in another vector wt) to a vector of pointers to edges in mst, we can use the loop

for (k = 1; k < G.V(); k++)
mst[k] = new EDGE(k, st[k], wt[k]);

This code is for the typical case where the MST is rooted at 0, and it does not put the dummy edge 0-0 onto the MST edge list.

These two conversions are trivial, but how do we convert from the vector-of-edge-pointers representation to the parent-link representation? We have the basic tools to accomplish this task easily as well: We convert to the graph representation using a loop like the one just given (changed to call insert for each edge), then run a a DFS starting at any vertex to compute the parent-link representation of the DFS tree, in linear time.

In short, although the choice of MST representation is a matter of convenience, we package all of our algorithms in a graph-processing class MST that computes a private vector mst of pointers to edges. Depending on the needs of applications, we can implement member functions for this class to return this vector or to give client programs other information about the MST, but we do not specify further details of this interface, other than to include a show member function that calls a similar function for each edge in the MST (see Exercise 20.8).

Exercises

20.8 Write a WeightedEdge class that implements the EDGE interface of Program 20.1 and also includes a member function show that prints edges and their weights in the format used in the figures in this chapter.

20.9 Implement an io class for weighted graphs that has show, scan, and scanEZ member functions (see Program 17.4).

20.10 Build a graph ADT that uses integer weights, but keep track of the minimum and maximum weights in the graph and include an ADT function that always returns weights that are numbers between 0 and 1.

20.11 Give an interface like Program 20.1 such that clients and implementations manipulate Edges (not pointers to them).

20.12 Develop an implementation of your interface from Exercise 20.11 that uses a minimal matrix-of-weights representation, where the iterator function nxt uses the information implicit in the row and column indices to create an Edge to return to the client.

20.13 Implement the iterator class for use with Program 20.5 (see Program 20.4).

20.14 Develop an implementation of your interface from Exercise 20.11 that uses a minimal adjacency-lists representation, where list nodes contain the weight and the destination vertex (but not the source) and the iterator function nxt uses implicit information to create an Edge to return to the client.

20.15 Modify the sparse-random-graph generator in Program 17.12 to assign a random weight (between 0 and 1) to each edge.

20.16 Modify the dense-random-graph generator in Program 17.13 to assign a random weight (between 0 and 1) to each edge.

20.17 Write a program that generates random weighted graphs by connecting vertices arranged in a V -by- V grid to their neighbors (as in Figure 19.3, but undirected) with random weights (between 0 and 1) assigned to each edge.

20.18 Write a program that generates a random complete graph that has weights chosen from a Gaussian distribution.

20.19 Write a program that generates V random points in the plane then builds a weighted graph by connecting each pair of points within a given distance d of one another with an edge whose weight is the distance (see Exercise 17.74). Determine how to set d so that the expected number of edges is E.

20.20 Find a large weighted graph online—perhaps a map with distances, telephone connections with costs, or an airline rate schedule.

20.21 Write down an 8-by-8 matrix that contains parent-link representations of all the orientations of the MST of the graph in Figure 20.1. Put the parent-link representation of the tree rooted at i in the ith row of the matrix.

20.22 Assume that a MST class constructor produces a vector-of-edge-pointers representation of an MST in mst[1] through mst[V]. Add a member function ST for clients (as in, for example, Program 18.3) such that ST(v) returns v’s parent in the tree (v if it is the root).

20.23 Under the assumptions of Exercise 20.22, write a member function that returns the total weight of the MST.

20.24 Suppose that a MST class constructor produces a parent-link representation of an MST in a vector st. Give code to be added to the constructor to compute a vector-of-edge-pointers representation in entries 1 through V of a private vector mst.

20.25 Define a TREE class. Then, under the assumptions of Exercise 20.22, write a member function that returns a TREE.