# Algorithms Third Edition in C++ Part 5. Graph Algorithms (2006)

### CHAPTER TWENTY

Minimum Spanning Trees

GRAPH MODELS WHERE we associate *weights* or *costs* with each edge are called for in many applications. In an airline map where edges represent flight routes, these weights might represent distances or fares. In an electric circuit where edges represent wires, the weights might represent the length of the wire, its cost, or the time that it takes a signal to propagate through it. In a job-scheduling problem, weights might represent time or the cost of performing tasks or of waiting for tasks to be performed.

Questions that entail cost minimization naturally arise for such situations. We examine algorithms for two such problems: (*i*) find the lowest-cost way to connect all of the points, and (*ii*) find the lowest-cost path between two given points. The first type of algorithm, which is useful for undirected graphs that represent objects such as electric circuits, finds a *minimum spanning tree*; it is the subject of this chapter. The second type of algorithm, which is useful for digraphs that represent objects such as an airline route map, finds the *shortest paths*; it is the subject of *Chapter 21*. These algorithms have broad applicability beyond circuit and map applications, extending to a variety of problems that arise on weighted graphs.

When we study algorithms that process weighted graphs, our intuition is often supported by thinking of the weights as distances: We speak of “the vertex closest to *x*,” and so forth. Indeed, the term “shortest path” embraces this bias. Despite numerous applications where we actually do work with distance and despite the benefits of geometric intuition in understanding the basic algorithms, it is important to remember that the weights do not need to be proportional to a distance at all; they might represent time or cost or an entirely different variable. Indeed, as we see in *Chapter 21*, weights in shortest-paths problems can even be *negative*.

To appeal to intuition in describing algorithms and examples while still retaining general applicability, we use ambiguous terminology where we refer interchangeably to edge lengths and weights. When we refer to a “short” edge, we mean a “low-weight” edge, and so forth. For most of the examples in this chapter, we use weights that are proportional to the distances between the vertices, as shown in *Figure 20.1*. Such graphs are more convenient for examples, because we do not need to carry the edge labels and can still tell at a glance that longer edges have weights higher than those of shorter edges. When the weights do represent distances, we can consider algorithms that gain efficiency by taking into account geometric properties (*Sections 20.7* and *21.5*). With that exception, the algorithms that we consider simply process the edges and do not take advantage of any implied geometric information (see *Figure 20.2*).

The problem of finding the minimum spanning tree of an arbitrary weighted undirected graph has numerous important applications, and algorithms to solve it have been known since at least the 1920s; but the efficiency of implementations varies widely, and researchers still seek better methods. In this section, we examine three classical algorithms that are easily understood at a conceptual level; in *Sections 20.3* through *20.5*, we examine implementations of each in detail; and in *Section 20.6*, we consider comparisons of and improvements on these basic approaches.

**Definition 20.1** *A* **minimum spanning tree (MST)** *of a weighted graph is a spanning tree whose weight (the sum of the weights of its edges) is no larger than the weight of any other spanning tree.*

If the edge weights are all positive, it suffices to define the MST as the set of edges with minimal total weight that connects all the vertices, as such a set of edges must form a spanning tree. The spanning-tree condition in the definition is included so that it applies for graphs that may have negative edge weights (see *Exercises 20.2* and *20.3*).

**Figure 20.1 A weighted undirected graph and its MST**

A weighted undirected graph is a set of weighted edges. The MST is a set of edges of minimal total weight that connects the vertices (black in the edge list, thick edges in the graph drawing). In this particular graph, the weights are proportional to the distances between the vertices, but the basic algorithms that we consider are appropriate for general graphs and make no assumptions about the weights (see *Figure 20.2*).

If edges can have equal weights, the minimum spanning tree may not be unique. For example, *Figure 20.2* shows a graph that has two different MSTs. The possibility of equal weights also complicates the descriptions and correctness proofs of some of our algorithms. We have to consider equal weights carefully, because they are not unusual in applications and we want to know that our algorithms operate correctly when they are present.

Not only might there be more than one MST, but also the nomenclature does not capture precisely the concept that we are minimizing the weight rather than the tree itself. The proper adjective to describe a specific tree is *minimal* (one having the smallest weight). For these reasons, many authors use more accurate terms like *minimal spanning tree* or *minimum-weight spanning tree*. The abbreviation *MST*, which we shall use most often, is universally understood to capture the basic concept.

Still, to avoid confusion when describing algorithms for networks that may have edges with equal weights, we do take care to be precise to use the term “minimal” to refer to “an edge of minimum weight” (among all edges in some specified set) and “maximal” to refer to “an edge of maximum weight.” That is, if edge weights are distinct, a minimal edge is the shortest edge (and is the only minimal edge); but if there is more than one edge of minimum weight, any one of them might be a minimal edge.

We work exclusively with undirected graphs in this chapter. The problem of finding a minimum-weight directed spanning tree in a di-graph is different, and is more difficult.

Several classical algorithms have been developed for the MST problem. These methods are among the oldest and most well-known algorithms in this book. As we have seen before, the classical methods provide a general approach, but modern algorithms and data structures can give us compact and efficient implementations. Indeed, these implementations provide a compelling example of the effectiveness of careful ADT design and proper choice of fundamental ADT data structure and algorithm implementations in solving increasingly diffi-cult algorithmic problems.

**Exercises**

**20.1** Assume that the weights in a graph are positive. Prove that you can rescale them by adding a constant to all of them or by multiplying them all by a constant without affecting the MSTs, provided only that the rescaled weights are positive.

**Figure 20.2 Arbitrary weights**

In this example, the edge weights are arbitrary and do not relate to the geometry of the drawn graph representation at all. This example also illustrates that the MST is not necessarily unique if edge weights may be equal: we get one MST by including 3-4 (shown) and a different MST by including 0-5 instead (although 7-6, which has the same weight as those two edges, is not in any MST).

**20.2** Show that, if edge weights are positive, a set of edges that connects all the vertices whose weights sum to a quantity no larger than the sum of the weights of any other set of edges that connects all the vertices is an MST.

**20.3** Show that the property stated in *Exercise 20.2* holds for graphs with negative weights, provided that there are no cycles whose edges all have non-positive weights.

• **20.4** How would you find a *maximum* spanning tree of a weighted graph?

• **20.5** Show that if a graph’s edges all have distinct weights, the MST is unique.

• **20.6** Consider the assertion that a graph has a unique MST only if its edge weights are distinct. Give a proof or a counterexample.

• **20.7** Assume that a graph has *t < V* edges with equal weights and that all other weights are distinct. Give upper and lower bounds on the number of different MSTs that the graph might have.