Union-Find (Disjoint Set)

intermediate union-find disjoint-set graph data-structure

Union-Find (also called Disjoint Set Union, DSU) is a data structure that tracks which elements belong to the same group. It supports two operations: find (which group is this element in?) and union (merge two groups together).

In simple language, imagine people at a party forming friend groups. Union-Find lets us quickly answer: “Are Alice and Bob in the same friend group?” and “Merge Alice’s group with Bob’s group.”

Why Not Just Use BFS/DFS?

We could use BFS/DFS for connected components. But Union-Find shines when:

  • Edges arrive one at a time (streaming/online)
  • We need to repeatedly check if two nodes are connected
  • We’re building connections incrementally (like Kruskal’s MST)

With Union-Find, each operation is nearly O(1). With BFS/DFS, we’d need to re-traverse the whole graph each time.

How It Works

Union-Find: Tree-based Groups
Initially: each node is its own parent (own group)
parent[0]=0 0
parent[1]=1 1
parent[2]=2 2
parent[3]=3 3
After union(0,1) and union(2,3):
0 1
2 3
Two groups: {0,1} and {2,3}. Roots are 0 and 2.
After union(1,3) — merges the two groups:
0
1
2 3
One group: {0,1,2,3}. find(3) → 2 → 0 (root).

Find follows parent pointers up to the root. Two nodes are in the same group if they have the same root.

Union connects two roots, merging their groups.

The Two Optimizations

Without optimizations, trees can become long chains (O(n) per find). Two tricks make it nearly O(1):

  1. Path Compression — During find, make every node point directly to the root. Flattens the tree.
  2. Union by Rank — Always attach the shorter tree under the taller one. Keeps trees balanced.

Together, they give us O(α(n)) per operation — that’s the inverse Ackermann function, which is effectively constant for any practical input.

Full Implementation

// Union-Find with path compression + union by rank
class UnionFind {
  constructor(n) {
    this.parent = Array.from({ length: n }, (_, i) => i);
    this.rank = Array(n).fill(0);
    this.count = n; // number of connected components
  }
  find(x) {
    if (this.parent[x] !== x)
      this.parent[x] = this.find(this.parent[x]); // path compression
    return this.parent[x];
  }
  union(x, y) {
    const px = this.find(x), py = this.find(y);
    if (px === py) return false; // already connected
    if (this.rank[px] < this.rank[py]) this.parent[px] = py;
    else if (this.rank[px] > this.rank[py]) this.parent[py] = px;
    else { this.parent[py] = px; this.rank[px]++; }
    this.count--; // one less component
    return true;
  }
  connected(x, y) { return this.find(x) === this.find(y); }
}
# Union-Find with path compression + union by rank
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.count = n  # number of connected components

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False  # already connected
        if self.rank[px] < self.rank[py]:
            px, py = py, px  # ensure px has higher rank
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        self.count -= 1
        return True

    def connected(self, x, y):
        return self.find(x) == self.find(y)
// Union-Find with path compression + union by rank
class UnionFind {
    int[] parent, rank;
    int count;
    UnionFind(int n) {
        parent = new int[n]; rank = new int[n]; count = n;
        for (int i = 0; i < n; i++) parent[i] = i;
    }
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]); // path compression
        return parent[x];
    }
    boolean union(int x, int y) {
        int px = find(x), py = find(y);
        if (px == py) return false;
        if (rank[px] < rank[py]) { int tmp = px; px = py; py = tmp; }
        parent[py] = px;
        if (rank[px] == rank[py]) rank[px]++;
        count--;
        return true;
    }
    boolean connected(int x, int y) { return find(x) == find(y); }
}

Problem: Number of Connected Components

Given n nodes and a list of edges, count the connected components. Classic Union-Find problem.

// Count connected components using Union-Find
function countComponents(n, edges) {
  const uf = new UnionFind(n);
  for (const [u, v] of edges) uf.union(u, v);
  return uf.count; // that's it!
}
# Count connected components using Union-Find
def count_components(n, edges):
    uf = UnionFind(n)
    for u, v in edges:
        uf.union(u, v)
    return uf.count  # that's it!
// Count connected components using Union-Find
int countComponents(int n, int[][] edges) {
    UnionFind uf = new UnionFind(n);
    for (int[] e : edges) uf.union(e[0], e[1]);
    return uf.count; // that's it!
}

Problem: Redundant Connection

Given a graph that was originally a tree plus one extra edge, find that extra edge. The first edge that connects two already-connected nodes is our answer.

// Find the redundant edge (one that creates a cycle)
function findRedundantConnection(edges) {
  const uf = new UnionFind(edges.length + 1);
  for (const [u, v] of edges) {
    if (!uf.union(u, v)) return [u, v]; // already connected = cycle!
  }
}
# Find the redundant edge (one that creates a cycle)
def find_redundant_connection(edges):
    uf = UnionFind(len(edges) + 1)
    for u, v in edges:
        if not uf.union(u, v):
            return [u, v]  # already connected = cycle!
// Find the redundant edge (one that creates a cycle)
int[] findRedundantConnection(int[][] edges) {
    UnionFind uf = new UnionFind(edges.length + 1);
    for (int[] e : edges)
        if (!uf.union(e[0], e[1])) return e; // already connected = cycle!
    return new int[0];
}

The trick: process edges one by one. If union returns false, both nodes are already in the same group — adding this edge creates a cycle.

When Union-Find vs BFS/DFS?

ScenarioBest Choice
Static graph, single traversalBFS/DFS
Edges added incrementallyUnion-Find
Repeated “are X and Y connected?” queriesUnion-Find
Need shortest pathBFS/DFS
Kruskal’s MSTUnion-Find
Number of islands (grid)Either (DFS is simpler)

Complexity

  • Find: O(α(n)) — nearly constant
  • Union: O(α(n)) — nearly constant
  • Space: O(n) for parent and rank arrays

Union-Find is one of those data structures that seems niche until we realize how many problems it solves elegantly. Connected components, cycle detection, MST — it’s a Swiss Army knife for graph connectivity.