Thinking in Minimum Spanning Trees

Minimum Spanning Trees are subgraphs of a graph that are TREES and connect all vertices together such the total cost of connection is minimal. Because of that, they are interesting objects.

Minimum Spanning Tree

Minimum spanning trees have many application, from routing in movement application to traversing of databases. My favorite is to define use in game AIs, by computing cost of travel for characters according to the kind of terrain they have to travel through.

A Background on Spanning Trees.

First, TREES, What are they? For any two given nodes have only one path between them and all nodes have one path between them. In 1857, Cayley coined the term TREE and it stuck though many graph TREES don’t have any resemblance to real trees.
Trees have no cycles, thus, no transitivity, no triangles, no clustering coefficient. Trees are efficient when the cost of establishing connections is high (think for example subway lines that are costly to build).
Minimum Spanning Trees are the optimisation of the COST of trees. If the graph is unweighted the cost is unitary and all edges are equal — and one can have multiple equivalent Minimum Spanning Trees. If the graph if weighted then you’ll have a unique Minimum Spanning Tree (usually).

MST Algorithm

An ALGORITHM to compute minimum spanning trees for a weighted undirected graph is the Prim’s Algorithm — although it was proposed 27 years earlier by a Czech mathematician called Vojtěch Jarník and the algorithm should be named after him (sometimes it is).

Here’s an example how the algorithm runs:

Spanning trees are very important because they are used in pathfinding algorithms like in the Dijkstra’s algorithm or in the A* Algorithm.

Spanning Tree in R and igraph

In R, the igraph package implements the Prim’s algorithm when computing minimum spanning trees.

A simple example of the use of the minimum spanning tree in R follows:

library(igraph)
g <- erdos.renyi.game(20, 0.25)
E(g)$weight  <- runif(length(E(g)), 0., 1.)
E(g)$width  <- 3.0*E(g)$weight
 
mst <- minimum.spanning.tree(g)
E(mst)$color <- 'red'
 
layout  <- layout.kamada.kawai(mst)
 
plot(g, layout=layout)
par(new=T)
E(mst)$width  <-  3
plot(mst, layout=layout)
par(new=F)

The image that illustrates this post on minimum spanning trees is one run of the short code above. I generated a random graph with 20 nodes and a probability of connecting any two of 25%. Then I gave the edges random weights and computed the weighted Minimum Spanning Tree. The trick of how to plot a minimum spanning tree superimposed on the original graph is to do it in a two-step plot. First you plot the graph with a pre-computed layout. Then you use the same pre-computed layout to plot the second layer that holds the minimum spanning tree.

You need to set the parameter new=True so that the R plot doesn’t erase the previous one. You can compute the layout for the minimum spanning tree if you want edges not to overlap and obtain a clear planar graph. iGraph offers many different graph layouts to experiment from. Kamada Kawai usually works very well for minimum spanning trees.