Greedy Algorithms for Minimum Spanning Tree Harvey J. Greenberg University of Colorado at Denver http://www.cudenver.edu/hgreenbe/ (url changed December 1, 1998) March 28, 1998 The glossary de nes a spanning tree for a connected graph with non-negative weights on its edges, and one problem: nd a max weight spanning tree. Remarkably, the greedy algorithm results in a solution. Here we present similar greedy algorithms due to Prim [3] and Kruskal [2], respectively, for the problem: nd a min weight spanning tree. Graham and Hell [1] gives a history of the problem, which originated with the work of Czekanowski in 1909. The material here is based on Rosen [4].
The Algorithms We are given a connected graph, G = [V; E ], with n vertices and m edges with non-negative weights, w(ei ). We rst sort the edges such that w(e1 ) : : : w(em). (This takes O(m log m) time.) The output is a spanning tree, T , whose total weight is a minimum. For each algorithm, T is initialized with e1 (an edge with minimum weight) and its two endpoints. The number of vertices in T is denoted v (T ) (which is initialized at 2).
Prim's Algorithm. do while v(T ) < n: interrogate edges (in order) until one is found that is
incident with a vertex in T and does not form a simple circuit in T . Then, add this edge and its endpoint to T (thereby increasing v (T ) by 1). Kruskal's Algorithm. do while v(T ) < n: interrogate edges (in order) until one is found that does not form a simple circuit in T . Then, add this edge and its endpoint(s) to T (thereby increasing v (T ) by 1 or 2). The algorithms dier in that Prim's requires that the next edge added be incident with a vertex in the (partial) tree, T , whereas Kruskal's just adds the next edge that does not form a circuit. To illustrate, we present the progression of Prim's and Kruskal's algorithms for the following graph: a
b
1 3 2
4
3
e 2
3
c
1
1
d
Prim a
b
a
b
a
e
b
a
b
e
e
d
c
d
b
a
b
Kruskal a
b
a
b
a
e
c
d
c
e
d
c
d
They arrive at the same minimum spanning tree whose total weight is 6.
Proof of Optimality Every graph with n vertices, n , 1 edges and no circuits must be a tree; in particular, the graph must be connected, and the algorithms result in a spanning tree (whose minimality is shown below). If the original graph is not connected, Prim's algorithm will nd a minimum spanning tree in the component containing e1 , then it will fail to add any more edges. Kruskal's algorithm will nd a minimum spanning tree for each component. In the following proof of optimality, we assume G is connected, and the algorithm added the edges in the order ei1 ; ei2 ; : : :; ein,1 (note: i1 = 1). For each subset, fei1 ; ei2 ; : : :; eik g, let Tk denote the associated subgraph consisting of those edges plus their endpoints. (In the case of Prim's algorithm, Tk is connected, so it is a tree with v (Tk ) = k + 1.) Choose k to be the maximum integer with the property that a minimum spanning tree exists that contains Tk . (k = 0 means no minimum spanning tree contains e1 , but the following proof shows this cannot happen.) Let T be a minimum spanning tree, with total weight w(T ), that contains Tk (for the maximum k), but k < n , 1 (i.e., T 6= Tn,1 ). Since eik+1 62 T , T [feik+1 g has a simple circuit containing eik+1 . We let ep be any edge in this circuit that was a candidate for Kruskal's algorithm, and we let eq be a candidate for Prim's algorithm. Here is a picture to clarify the notation:
2
ei ei
1
2
Tk T*
{e i k+1} =
is a subtree in Prim’s algorithm.
(Need not be connected in Kruskal’s.)
ei
eq
k
ei
k+1
ep
For Prim’s algorithm, this node is the same as one above.
We then let e0 denote ep or eq , according to which algorithm is executed, and we consider an exchange of eik+1 for e0 . The circuit cannot contain only edges in Tk because that would make eik+1 ineligible, so e0 is not one of the previously selected edges in Tk , which means the exchange results in a new spanning tree, T 0 , with total weight, w(T 0) = w(T ) + w(eik+1 ) , w(e0). Since e0 was a candidate, the rules for adding an edge in either algorithm imply w(eik+1 ) w(e0), so w(T 0) w(T ). Since T is a minimum spanning tree, equality must hold, so we have w(T 0) = w(T ), which means T 0 is also a minimum spanning tree. However, T 0 Tk+1 , which contradicts the maximality of k. The implication of this is that either greedy algorithm (Prim or Kruskal) for the minimum spanning tree problem produces an optimal solution.
References [1] R.L. Graham and P. Hell. On the history of the minimum spanning tree problem. Annals of the History of Computing, 7(1):43{57, 1985. [2] J.B. Kruskal. On the shortest spanning tree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society, 7:48{50, 1956. [3] R.C. Prim. Shortest connection networks and some generalizations. Bell Systems Technology Journal, 36:1389{1401, 1957. [4] K.H. Rosen. Discrete Mathematics and Its Applications. McGraw-Hill, Inc., New York, NY, third edition, 1995. 3