Folioby Interconnected
Log InSign Up

Why the Catalan numbers appear everywhere: parenthesizations, binary trees, and lattice paths

The Catalan number $C_n = \frac{1}{n+1}\binom{2n}{n}$ answers questions about balanced parentheses, binary trees, lattice paths, polygon triangulations, and ballot sequences. Through the reflection principle and explicit bijections, we explain why all these problems share the same answer.

FO
Folio Official
March 1, 2026

Combinatorics has its celebrities – numbers that show up again and again in problems that bear no apparent relation to one another. The undisputed star is the Catalan number. Balanced parenthesizations, binary tree shapes, lattice paths that stay below the diagonal, triangulations of convex polygons, ballot sequences – solve any one of these counting problems and the same sequence 1,1,2,5,14,42,132,… emerges. Why?

1 Definition

Definition 1 (Catalan number).
The n-th Catalan number is
Cn​=n+11​(n2n​).
The first several values are C0​=1, C1​=1, C2​=2, C3​=5, C4​=14, C5​=42.
Remark 2.
Since (n2n​) is an integer, one might wonder why dividing by n+1 still yields an integer. The combinatorial interpretations make integrality obvious – the number of binary trees with n internal nodes must be a non-negative integer – but an algebraic proof requires a small argument. The identity Cn​=(n2n​)−(n+12n​), which expresses Cn​ as the difference of two binomial coefficients, settles the matter.

2 Five combinatorial incarnations

The Catalan number Cn​ is simultaneously the answer to each of the following problems.

Example 3 (Balanced parenthesizations).
The number of ways to arrange n pairs of parentheses so that they are correctly matched is Cn​. For n=3, the five valid strings are ((())), (()()), (())(), ()(()), ()()(), confirming C3​=5.
Example 4 (Binary trees).
The number of distinct binary trees with n internal nodes is Cn​.
Example 5 (Lattice paths below the diagonal).
The number of shortest lattice paths from (0,0) to (n,n) that never cross above the diagonal y=x (i.e., satisfying y≤x at every step) is Cn​.
Example 6 (Triangulations of a convex polygon).
The number of ways to dissect a convex (n+2)-gon into triangles using non-crossing diagonals is Cn​.
Example 7 (Ballot sequences).
The number of sequences of 2n steps – n up-steps (↗) and n down-steps (↘) – that never dip below the starting height is Cn​. This is a visual rendering of the balanced parenthesization problem.

3 Derivation via the reflection principle

We derive the Catalan formula from the lattice path interpretation.

Theorem 8.
The number of shortest lattice paths from (0,0) to (n,n) that never cross above the diagonal is Cn​=n+11​(n2n​).
Proof.
Without any constraint, the total number of shortest paths from (0,0) to (n,n) is (n2n​) (choose which n of the 2n steps go right).

A "bad" path – one that crosses above the diagonal – must touch the line y=x+1 at some point. At the first such touching point, reflect the remainder of the path across the line y=x+1.

This reflection establishes a bijection between bad paths from (0,0) to (n,n) and all paths from (0,0) to (n−1,n+1), so the number of bad paths is (n−12n​).

The desired count is therefore
(n2n​)−(n−12n​)=(n2n​)−n+1n​(n2n​)=n+11​(n2n​)=Cn​.
□
Remark 9.
The power of the reflection principle lies in the construction of a bijection. Rather than counting bad paths directly, we map them one-to-one onto paths with a different endpoint, whose count we already know. This technique extends to a wide variety of lattice path problems.

4 Why the same number: bijections between interpretations

The reason Cn​ appears in so many guises is that there exist explicit bijections between the solution sets of these problems.

Remark 10.
For instance, the bijection between balanced parenthesizations with n pairs and lattice paths below the diagonal from (0,0) to (n,n) is constructed as follows:
  • Map each opening parenthesis "(" to a rightward step →.

  • Map each closing parenthesis ")" to an upward step ↑.

  • The condition that the parenthesization is balanced is equivalent to the condition that the path never crosses above the diagonal.

5 Derivation via generating functions

Theorem 11.
The generating function C(x)=∑n=0∞​Cn​xn of the Catalan numbers is
C(x)=2x1−1−4x​​.
Proof.
Consider the recursive structure of binary trees. Every non-empty binary tree decomposes as a root together with a left subtree and a right subtree. If the left subtree has k internal nodes and the right subtree has n−1−k, then
Cn​=k=0∑n−1​Ck​Cn−1−k​(n≥1),C0​=1.

Translating into generating functions: C(x)=1+xC(x)2, where the 1 accounts for C0​ and the factor xC(x)2 encodes the recursive decomposition.

Solving the quadratic xC(x)2−C(x)+1=0 gives
C(x)=2x1±1−4x​​.

Since C(0)=C0​=1 must be finite, the + sign produces a singularity, so the correct branch is C(x)=2x1−1−4x​​. □

6 Takeaway

  • The Catalan number Cn​=n+11​(n2n​) admits at least five distinct combinatorial interpretations.

  • The reflection principle is a powerful technique for counting lattice paths by bijecting "bad" paths to paths with a known endpoint.

  • The reason the same number appears in different problems is the existence of explicit bijections between their solution sets.

  • The generating function C(x)=2x1−1−4x​​ arises from the recursive structure of binary trees.

In the next article, we turn to a deceptively simple principle that proves the existence of objects without ever exhibiting them: the pigeonhole principle.

CombinatoricsMathematicsBetween the Lines
FO
Folio Official

Mathematics "between the lines" — exploring the intuition textbooks leave out, written in LaTeX on Folio.

1 followers·107 articles
Combinatorics — Between the LinesPart 5 of 8
Previous
Recurrences and generating functions: what happens when you turn a sequence into a function
Next
The pigeonhole principle: a surprisingly powerful tool for existence proofs

Share your expertise with the world

Write articles with LaTeX support, build your audience, and earn from your knowledge.

Start Writing — It's Free

More from Folio Official

Folio Official·March 1, 2026

Recurrences and generating functions: what happens when you turn a sequence into a function

A generating function is the "business card" of a sequence. Multiplication becomes convolution, and recurrences become algebraic equations. We derive the closed form for the Fibonacci numbers (Binet's formula) as a showcase of the method.

CombinatoricsMathematicsBetween the Lines
Folio Official·March 1, 2026

Combinatorics meets dynamic programming: when counting becomes computation

Dynamic programming is bottom-up evaluation of a recurrence; a generating function is its analytic closed form. We draw the correspondence between the knapsack DP and generating function multiplication, discuss convolution and FFT, and survey the connections across the entire series.

CombinatoricsMathematicsBetween the Lines
Folio Official·March 1, 2026

Inclusion--exclusion: why the alternating signs work

The inclusion--exclusion principle corrects for overcounting by alternately adding and subtracting intersection sizes. We prove the general formula using indicator functions, apply it to count derangements, and trace the thread forward to the Mobius inversion formula on partially ordered sets.

CombinatoricsMathematicsBetween the Lines
Folio Official·March 1, 2026

Counting under symmetry: Burnside's lemma and Polya's enumeration theorem

When objects related by rotation or reflection are considered identical -- as with necklaces -- naive counting fails. Using the language of group actions, Burnside's lemma reduces the problem to a clean formula: the number of distinct objects equals the average number of fixed points across the group.

CombinatoricsMathematicsBetween the Lines