The Union-Find data structure, also known as Disjoint Set Union (DSU), is a foundational structure in computer science algorithms used to efficiently manage and query the connectivity of elements divided into disjoint sets. A common challenge in naive implementations of Union-Find is the rapid growth of tree height, leading to costly operations.

This article focuses on the Union by Rank technique, a powerful strategy to keep the Union-Find trees balanced by linking the smaller tree under the root of the larger tree, effectively optimizing the union operation. Understanding and implementing union by rank can drastically speed up algorithms that rely heavily on Union-Find, such as Kruskal’s Minimum Spanning Tree algorithm, network connectivity checks, and more.

What Is Union by Rank?

Union by Rank is an optimization technique for the union operation in a Union-Find data structure. Instead of arbitrarily attaching one tree under another during union, each tree maintains a rank (an approximate measure of its height). The tree with a smaller rank is always attached below the root of the tree with a larger rank, thus maintaining a shallow overall tree.

This careful attachment prevents degeneration into a tall chain that would slow down find and union operations. The rank is updated only when two trees of the same rank are united, incrementing the rank of the new root by one.

Why Use Union by Rank?

  • Improved Efficiency: By preventing tall trees, find operations approach an average time complexity near O(α(n)) (inverse Ackermann function), which is practically constant.
  • Reduced Depth: Balanced trees have minimized depth, translating into fewer steps to find the root.
  • Compatibility with Path Compression: Union by Rank combined with path compression yields one of the fastest known Union-Find implementations.

How Does Union by Rank Work? — Step by Step

Imagine two disjoint sets represented as trees. Each has a root node, with an associated rank representing the tree’s height (or an upper bound on it). The union operation merges these sets.

  1. Find the roots of both elements’ sets using find() operation.
  2. Compare their ranks.
  3. Attach the root with the lower rank as a child of the root with the higher rank.
  4. If ranks are equal, attach one to the other and increase the new root’s rank by 1.

This diagram illustrates how a node with rank 1 is merged under a tree with rank 2, keeping the overall tree balanced.

Union by Rank Algorithm Pseudocode

function union(x, y):
  rootX = find(x)
  rootY = find(y)

  if rootX == rootY:
    return  # already in the same set

  if rank[rootX] > rank[rootY]:
    parent[rootY] = rootX
  else if rank[rootX] < rank[rootY]:
    parent[rootX] = rootY
  else:
    parent[rootY] = rootX
    rank[rootX] += 1
  

Note: The rank array holds the approximate depth of each root node.

Explicit Example with Visual Output

Suppose we have elements 1, 2, 3, 4, and 5, starting as individual sets. Perform union operations and observe the rank adjustments and tree structure.

  • union(1, 2) — Both have rank 0, so attach 2 under 1 and increase rank of 1 to 1.
  • union(3, 4) — Both same rank 0, attach 4 under 3; rank of 3 is now 1.
  • union(1, 3) — Both roots have rank 1, attach 3 under 1 and increase rank of 1 to 2.
  • union(5, 1) — Rank(5) is 0, Rank(1) is 2, so attach 5 under 1.

This final tree shows a balanced structure with root 1 and its children maintaining minimized height.

Interactive Union by Rank Visualization (Concept)

For an interactive experience, imagine a system where elements can be clicked to union, and the visual tree updates dynamically to show ranked roots and height changes live. This helps in grasping how union by rank prevents tall trees.

Combined Optimization: Union by Rank and Path Compression

Union by rank is often combined with path compression. Path compression flattens the tree structure whenever find() is called by making all nodes on the path point directly to the root.

Combined, they yield almost amortized constant time operations, a huge improvement over naive methods.

Union by Rank: Balance Union-Find Trees Efficiently for Optimal Performance

Summary

Union by Rank is a critical technique for efficiently managing disjoint sets by keeping trees balanced during union operations. It limits tree height by always attaching smaller ranked trees under higher ranked ones, improving the runtime complexity of unions and finds. For best performance, pair it with path compression.

This optimization is essential for performance-sensitive algorithms in networking, clustering, and graph theory.