Data Structures and Algorithms
Proving Greedy Algorithms

The very nature of greedy algorithms makes them difficult to prove. We choose the step that maximises the immediate gain (in the case of the minimum spanning tree - made the smallest possible addition to the total cost so far) without thought for the effect of this choice on the remainder of the problem.

So the commonest method of proving a greedy algorithm is to use proof by contradiction, we show that if we didn't make the "greedy" choice, then, in the end, we will find that we should have made that choice.

The Minimum Spanning Tree Algorithm

At each step in the MST algorithm, we choose the cheapest edge that would not create a cycle.

We can easily establish that any edge creating a cycle should not be added. The cycle-completing edge is more expensive than any previously added edge and the nodes which it connects are already joined by some path. Thus it is redundant and can be left out.

Each edge that we add must join two sub-trees. If the next cheapest edge, ex, would join two sub-trees, Ta and Tb, then we must, at some later stage, use a more expensive edge, ey, to join Ta to Tb, either directly or by joining a node of one of them to a node that is now connected to the other. But we can join Ta to Tb (and any nodes which are now connected to them) more cheaply by using ex, which proves the proposition that we should choose the cheapest edge at each stage.

Complexity

The steps in Kruskal's algorithm are:
Initialise the forest
 
O(|V|)
Sort the edges
 
O(|E|log|E|)
Until we've added |V|-1 edgesO(|V|) x
Check whether an edge
forms a cycle
O(|V|) = O(|V|2)
Total
 
O(|V|+|E|log|E|+|V|2)
Since |E|=O(|V|2)
 
O(|V|2log|V|)

Thus, we may refer to Kruskal's algorithm as an O(n2log n) algorithm, where n is the number of vertices. However, note that if |E| is similar to |V|, then the complexity is O(n2).

The alternative MST algorithm, Prim's algorithm, can be made to run in O(|E| + |V|log|V|) time, using Fibonacci heaps.

Because of its wide practical application, the MST problem has been extensively studied and, for sparse (|E| approx= |V|) graphs, an even faster algorithm O(|E|log log|V|) is known (cf Fredman and Tarjan, "Fibonacci heaps and their uses in improved network optimization algorithms", JACM, 34(3), 596-615(1987).)

This emphasizes the point that no good software engineer tries to re-invent wheels, he keeps a good algorithms text in his library and makes sure to refer to it before attempting to program a new problem!

Return to Minimum Spanning Tree Back to the Table of Contents
© , 1998