Why does the greedy algorithm work for trees? — What matroids reveal

Kruskal's algorithm for minimum spanning trees is absurdly simple: sort the edges by weight and add the cheapest one that does not create a cycle. Why does this greedy strategy produce an optimal solution? The answer lies in the cut property and, more deeply, in matroids.

Folio Official

March 1, 2026

In an algorithms course, the minimum spanning tree (MST) problem brings a moment of genuine wonder. Kruskal's algorithm is astonishingly simple: "examine edges in order of increasing weight; accept an edge if it does not create a cycle." That is all.

But a moment's reflection should make you uneasy. Greedy algorithms do not, in general, produce optimal solutions. So why does the greedy approach work for MSTs? This article gives an intuitive answer, first through the cut property, and then through the deeper structure of matroids.

1. The minimum spanning tree problem

Definition 1 (Minimum spanning tree).

Given a connected weighted graph

G = (V, E, w)

, a spanning tree

T

whose total weight

\sum_{e \in T} w (e)

is minimized is called a minimum spanning tree (MST).

In this example, choosing edges ${A, B}$ (weight 1), ${B, C}$ (weight 2), and ${C, D}$ (weight 3) yields an MST of total weight 6. Kruskal's algorithm produces exactly this selection.

2. Kruskal's algorithm – greedy in its purest form

The algorithm is disarmingly simple:

Sort all edges in non-decreasing order of weight.
Iterate through the sorted edges; accept an edge if adding it does not create a cycle.
Stop after accepting $n - 1$ edges.

Remark 2.

The "does it create a cycle?" check can be performed in

O (α (n))

(effectively constant time) using a Union-Find data structure. The overall running time is dominated by the initial sort:

O (∣ E ∣ lo g ∣ E ∣)

The real question, however, is why does this naive strategy produce an optimal solution?

3. When greedy fails – the knapsack problem

To appreciate why the MST result is remarkable, let us first see a case where greediness fails.

Example 3 (0-1 knapsack problem).

A knapsack has capacity

W = 10

. The available items are:

Item A: weight 6, value 8 (value density $\approx 1.33$ )
Item B: weight 5, value 7 (value density $= 1.4$ )
Item C: weight 5, value 6 (value density $= 1.2$ )

A greedy strategy that picks items by highest value density selects B (weight 5, value 7), cannot fit A, then takes C (weight 5, value 6), for a total value of 13. But selecting A and C yields a total value of 14. The greedy algorithm misses the optimum.

The fundamental reason greedy fails here is that the locally best choice need not be globally best. So why does this trap not ensnare the MST problem?

4. The cut property – the intuitive reason greedy works

The most intuitive explanation for the correctness of Kruskal's algorithm is the cut property.

Theorem 4 (Cut property).

Let

(S, V ∖ S)

be any partition of the vertex set of a graph

G

(such a partition is called a cut). Among all edges crossing the cut, the one with the smallest weight belongs to some MST of

G

Remark 5 (Intuition for the cut property).

A cut is a line that slices the graph into two pieces. To reconnect the two pieces, at least one crossing edge is needed. If we want to reconnect at minimum cost, we should choose the lightest crossing edge – and the optimality of this choice is intuitively clear.

At the moment Kruskal's algorithm considers an edge ${u, v}$ , if $u$ and $v$ lie in different connected components, those two components define a cut. Because edges are processed in non-decreasing order of weight, the first edge encountered that crosses this cut is the lightest such edge. The cut property therefore guarantees its inclusion in an MST.

5. Matroids – the structure behind greedy optimality

There is, in fact, a deeper mathematical reason that the greedy algorithm succeeds on MSTs.

Definition 6 (Matroid).

A pair

(E, I)

consisting of a finite set

E

and a collection

I \subseteq 2^{E}

of subsets is a matroid if:

$\emptyset \in I$ (the empty set is independent).
If $A \in I$ and $B \subseteq A$ , then $B \in I$ (the hereditary property).
If $A, B \in I$ and $∣ A ∣ < ∣ B ∣$ , then there exists $x \in B ∖ A$ such that $A \cup {x} \in I$ (the exchange axiom).

Elements of

I

are called independent sets.

Example 7 (Graphic matroid).

Given a graph

G = (V, E)

, declare a subset of

E

to be independent if it contains no cycle (i.e., it forms a forest). The resulting pair

(E, I)

is a matroid, called the graphic matroid.

Theorem 8 (Greedy optimality on matroids).

(E, I)

is a matroid and each element

e \in E

is assigned a weight

w (e)

, then the greedy algorithm – examining elements in non-decreasing order of weight and accepting each one that preserves independence – produces a minimum-weight maximal independent set.

Remark 9 (Why matroids make greedy work).

The exchange axiom is the key. It guarantees that a smaller independent set can always be augmented toward a larger one: there are no dead ends. No matter how the greedy process builds up its solution, it is never painted into a corner.

In the knapsack problem, choosing item B first makes it impossible to add item A due to the capacity constraint. This corresponds to a failure of the exchange axiom. In a matroid, such "gridlock" is structurally impossible, which is why the greedy algorithm can always reach an optimal solution.

6. The correctness of Kruskal's algorithm – a summary

We can now understand why Kruskal's algorithm is optimal at two levels:

Cut property level: Because edges are processed in non-decreasing order of weight, the lightest edge across each cut is always selected.
Matroid level: Forests (acyclic edge sets) form a matroid, and the greedy algorithm is always optimal on matroids.

Remark 10.

Interestingly, Prim's algorithm also computes MSTs, but it applies the greedy idea in a different direction. Kruskal selects the globally lightest safe edge, while Prim selects the lightest edge incident to the current tree. Both exploit the cut property (in different ways) and both arrive at an optimal solution.

In the next article, we examine the most dramatic complexity gap in graph theory: the chasm between Euler and Hamilton.

Graph Theory Mathematics Between the Lines Matroids Greedy Algorithm Minimum Spanning Trees

Folio Official

Mathematics "between the lines" — exploring the intuition textbooks leave out, written in LaTeX on Folio.

1 followers·107 articles

Why does the greedy algorithm work for trees? — What matroids reveal

Folio Official

March 1, 2026

1. The minimum spanning tree problem

Definition 1 (Minimum spanning tree).

Given a connected weighted graph

G = (V, E, w)

, a spanning tree

T

whose total weight

\sum_{e \in T} w (e)

is minimized is called a minimum spanning tree (MST).

In this example, choosing edges ${A, B}$ (weight 1), ${B, C}$ (weight 2), and ${C, D}$ (weight 3) yields an MST of total weight 6. Kruskal's algorithm produces exactly this selection.

2. Kruskal's algorithm – greedy in its purest form

The algorithm is disarmingly simple:

Sort all edges in non-decreasing order of weight.
Iterate through the sorted edges; accept an edge if adding it does not create a cycle.
Stop after accepting $n - 1$ edges.

Remark 2.

The "does it create a cycle?" check can be performed in

O (α (n))

(effectively constant time) using a Union-Find data structure. The overall running time is dominated by the initial sort:

O (∣ E ∣ lo g ∣ E ∣)

The real question, however, is why does this naive strategy produce an optimal solution?

3. When greedy fails – the knapsack problem

To appreciate why the MST result is remarkable, let us first see a case where greediness fails.

Example 3 (0-1 knapsack problem).

A knapsack has capacity

W = 10

. The available items are:

Item A: weight 6, value 8 (value density $\approx 1.33$ )
Item B: weight 5, value 7 (value density $= 1.4$ )
Item C: weight 5, value 6 (value density $= 1.2$ )

The fundamental reason greedy fails here is that the locally best choice need not be globally best. So why does this trap not ensnare the MST problem?

4. The cut property – the intuitive reason greedy works

The most intuitive explanation for the correctness of Kruskal's algorithm is the cut property.

Theorem 4 (Cut property).

Let

(S, V ∖ S)

be any partition of the vertex set of a graph

G

(such a partition is called a cut). Among all edges crossing the cut, the one with the smallest weight belongs to some MST of

G

Remark 5 (Intuition for the cut property).

5. Matroids – the structure behind greedy optimality

There is, in fact, a deeper mathematical reason that the greedy algorithm succeeds on MSTs.

Definition 6 (Matroid).

A pair

(E, I)

consisting of a finite set

E

and a collection

I \subseteq 2^{E}

of subsets is a matroid if:

$\emptyset \in I$ (the empty set is independent).
If $A \in I$ and $B \subseteq A$ , then $B \in I$ (the hereditary property).
If $A, B \in I$ and $∣ A ∣ < ∣ B ∣$ , then there exists $x \in B ∖ A$ such that $A \cup {x} \in I$ (the exchange axiom).

Elements of

I

are called independent sets.

Example 7 (Graphic matroid).

Given a graph

G = (V, E)

, declare a subset of

E

to be independent if it contains no cycle (i.e., it forms a forest). The resulting pair

(E, I)

is a matroid, called the graphic matroid.

Theorem 8 (Greedy optimality on matroids).

(E, I)

is a matroid and each element

e \in E

is assigned a weight

w (e)

, then the greedy algorithm – examining elements in non-decreasing order of weight and accepting each one that preserves independence – produces a minimum-weight maximal independent set.

Remark 9 (Why matroids make greedy work).

6. The correctness of Kruskal's algorithm – a summary

We can now understand why Kruskal's algorithm is optimal at two levels:

Cut property level: Because edges are processed in non-decreasing order of weight, the lightest edge across each cut is always selected.
Matroid level: Forests (acyclic edge sets) form a matroid, and the greedy algorithm is always optimal on matroids.

Remark 10.

In the next article, we examine the most dramatic complexity gap in graph theory: the chasm between Euler and Hamilton.

Graph Theory Mathematics Between the Lines Matroids Greedy Algorithm Minimum Spanning Trees

Folio Official

Mathematics "between the lines" — exploring the intuition textbooks leave out, written in LaTeX on Folio.

1 followers·107 articles

Why does the greedy algorithm work for trees? — What matroids reveal

1. The minimum spanning tree problem

2. Kruskal's algorithm – greedy in its purest form

3. When greedy fails – the knapsack problem

4. The cut property – the intuitive reason greedy works

5. Matroids – the structure behind greedy optimality

6. The correctness of Kruskal's algorithm – a summary

Share your expertise with the world

More from Folio Official

What the max-flow min-cut theorem is really saying — Duality in action

Why are trees special? — The beauty of minimal connectivity

The magic of bipartite graphs — Why everything works

Matroids and the Greedy Algorithm

Why does the greedy algorithm work for trees? — What matroids reveal

1. The minimum spanning tree problem

2. Kruskal's algorithm – greedy in its purest form

3. When greedy fails – the knapsack problem

4. The cut property – the intuitive reason greedy works

5. Matroids – the structure behind greedy optimality

6. The correctness of Kruskal's algorithm – a summary

Share your expertise with the world

More from Folio Official

What the max-flow min-cut theorem is really saying — Duality in action

Why are trees special? — The beauty of minimal connectivity

The magic of bipartite graphs — Why everything works

Matroids and the Greedy Algorithm