Given a set of \(N\) elements, support two operations:
connect(4, 3)
connect(3, 8)
connect(6, 5)
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // true
isConnected(5, 7) // false
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // true
Is there a path connecting cyan and pink elements?


Yes.
Note: finding the path explicitly is a harder problem
Applications involve manipulating elements of all types
When programming, it is convenient to name elements 0 to N-1.
We model "is connected to" as an equivalence relation:
Reflexive: p is connected to p
Symmetric: if p is connected to q, then q is connected to p
Transitive: if p is connected to q and q is connected to r, then p is connected to r
3 disjoint sets / connected components \[ \{0\}\ \{1,4,5\}\ \{2,3,6,7\} \]
Union: replace the sets containing p and q with their union
Find: which set contains element p?
\[\{0\}\ \{1,4,5\}\ \{2,3,6,7\}\quad\Rightarrow\quad\{0\}\ \{1,2,3,4,5,6,7\}\]
find(5) != find(6)
union(2, 5) // 3 disjoint sets -> 2 disjoint sets
find(5) == find(6)
How to model the dynamic-connectivity problem using union-find?
Maintain disjoint sets that correspond to connected components
union(2, 5)
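The modeling step can be sketched in a few lines: connect(p, q) becomes union(p, q), and isConnected(p, q) becomes find(p) == find(q). The class name `Connectivity` and its simple array-based internals are assumptions made for illustration; any correct union-find implementation could sit underneath.

```java
// Minimal sketch: dynamic connectivity layered on a union-find structure.
// The class name Connectivity and its internals are illustrative assumptions.
class Connectivity {
    private final int[] id; // id[p] identifies the set containing p

    Connectivity(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i; // n singleton sets
    }

    private int find(int p) {
        return id[p];
    }

    private void union(int p, int q) {
        int pid = id[p];
        int qid = id[q];
        for (int i = 0; i < id.length; i++)
            if (id[i] == pid) id[i] = qid; // relabel p's whole set
    }

    // connect(p, q) is modeled as union(p, q)
    void connect(int p, int q) {
        union(p, q);
    }

    // isConnected(p, q) is modeled as find(p) == find(q)
    boolean isConnected(int p, int q) {
        return find(p) == find(q);
    }

    public static void main(String[] args) {
        Connectivity c = new Connectivity(10);
        c.connect(4, 3); c.connect(3, 8); c.connect(6, 5);
        c.connect(9, 4); c.connect(2, 1);
        System.out.println(c.isConnected(8, 9)); // true
        System.out.println(c.isConnected(5, 7)); // false
        c.connect(5, 0); c.connect(7, 2); c.connect(6, 1); c.connect(1, 0);
        System.out.println(c.isConnected(5, 7)); // true
    }
}
```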
Goal: design an efficient union-find data type
public class UF {
    UF(int N)                 // initialize union-find data structure with N singleton sets (0 to N-1)
    void union(int p, int q)  // merge sets containing elements p and q
    int find(int p)           // identifier for set containing element p (0 to N-1)
}
public static void main(String[] args) {
    int N = StdIn.readInt();
    UF uf = new UF(N);
    while (!StdIn.isEmpty()) {
        int p = StdIn.readInt();
        int q = StdIn.readInt();
        // print only the pairs that were not already connected
        if (uf.find(p) != uf.find(q)) {
            uf.union(p, q);
            StdOut.println(p + " " + q);
        }
    }
}
Note: with the input below, the pairs on lines 8, 12, and 13 (8 9, 1 0, and 6 7) are already connected when they are read, so they are not printed.
% more tinyUF.txt
10
4 3
3 8
6 5
9 4
2 1
8 9
5 0
7 2
6 1
1 0
6 7
Data Structure
id[] of length N
id[p] identifies the set containing element p
\[ \{0,5,6\}\ \{1,2,7\}\ \{3,4,8,9\} \]
// 0 1 2 3 4 5 6 7 8 9
int [] id = {0,1,1,8,8,0,0,1,8,8};
// find(5) == 0
Q: How to implement find(p)?
A: Easy, just return id[p]
Data Structure
id[] of length N
id[p] identifies the set containing element p
\[ \{0,5,6\}\ \{1,2,7\}\ \{3,4,8,9\} \Rightarrow \{0,1,2,5,6,7\}\ \{3,4,8,9\} \]
// 0 1 2 3 4 5 6 7 8 9
int [] id = {0,1,1,8,8,0,0,1,8,8};
union(6,1);
// id = ??
Q: How to implement union(p,q)?
A: Change all entries whose identifier equals id[p] to id[q].
id = {1,1,1,8,8,1,1,1,8,8}
public class QuickFindUF {
    private int[] id;

    public QuickFindUF(int N) {
        // set id of each element to itself (N array accesses)
        id = new int[N];
        for (int i = 0; i < N; i++)
            id[i] = i;
    }

    public int find(int p) {
        // return the id of p (1 array access)
        return id[p];
    }

    public void union(int p, int q) {
        // change all entries with id[p] to id[q]
        // (N+2 to 2N+2 array accesses)
        int pid = id[p];
        int qid = id[q];
        for (int i = 0; i < id.length; i++) {
            if (id[i] == pid) id[i] = qid;
        }
    }
}
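As a quick check of the union step, the sketch below applies the same relabeling loop to the example array from the slides and prints the result. The class name `QuickFindDemo` and the static helper are assumptions made so the snippet runs on its own.

```java
import java.util.Arrays;

// Sketch only: QuickFindDemo and its static helper are illustrative names.
class QuickFindDemo {
    // Same relabeling loop as QuickFindUF.union, applied to a given array
    static void union(int[] id, int p, int q) {
        int pid = id[p]; // cache both identifiers before the loop
        int qid = id[q];
        for (int i = 0; i < id.length; i++)
            if (id[i] == pid) id[i] = qid;
    }

    public static void main(String[] args) {
        int[] id = {0, 1, 1, 8, 8, 0, 0, 1, 8, 8};
        union(id, 6, 1);
        System.out.println(Arrays.toString(id));
        // prints [1, 1, 1, 8, 8, 1, 1, 1, 8, 8]
    }
}
```

Caching id[p] in pid matters: comparing against id[p] inside the loop would stop matching as soon as id[p] itself is overwritten, leaving part of the set unrelabeled.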
| algorithm | initialize | union | find |
|---|---|---|---|
| quick-find | \(N\) | \(N\) | \(1\) |
Note: ignoring leading constant
Union is too expensive! Processing a sequence of \(N\) union operations on \(N\) elements takes more than \(N^2\) (quadratic) array accesses.
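A rough check of the quadratic claim, using the access counts noted in the QuickFindUF comments (each union makes at least \(N+2\) array accesses):

\[ N \times (N + 2) = N^2 + 2N > N^2 \]

For example, \(N = 1000\) gives at least \(1{,}002{,}000\) array accesses for 1000 union operations.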
Rough standard (for now)
Ex. Huge problem for quick-find
Quadratic algorithms don't scale with technology

Data Structure
parent[] of length N, where parent[i] is the parent of i in the tree
\[ \{0\}\ \{1\}\ \{2,3,4,9\}\ \{5,6\}\ \{7\}\ \{8\} \]

// 0 1 2 3 4 5 6 7 8 9
int[] parent = {0,1,9,4,9,6,6,7,8,9};
// parent of 3 is 4, parent of 4 is 9, parent of 9 is 9
// root of 3 is 9
Q: How to implement find(p)?
A: Return root of tree containing p
\[ \ldots \{2,3,4,9\} \{5,6\} \ldots \Rightarrow \ldots \{2,3,4,5,6,9\} \ldots \]

// 0 1 2 3 4 5 6 7 8 9
int[] parent = {0,1,9,4,9,6,6,7,8,9};
union(3, 5)
// parent = ???
Q: How to implement union(p,q)?
A: Set the parent of p's root to q's root.
// after union(3, 5): parent = {0,1,9,4,9,6,6,7,8,6}
// only one value changes: parent[9] goes from 9 to 6
union(4,3)
union(3,8)
union(6,5)
union(9,4)
union(2,1)
isConnected(8,9)
!isConnected(5,4)
union(5,0)
union(7,2)
union(6,1)
union(7,3)
int[] parent = {0,1,2,3,4,5,6,7,8,9};
union(4,3); // 0 1 2 3 4 5 6 7 8 9 => 0 1 2 3 3 5 6 7 8 9
union(3,8); // 0 1 2 3 3 5 6 7 8 9 => 0 1 2 8 3 5 6 7 8 9
union(6,5); // 0 1 2 8 3 5 6 7 8 9 => 0 1 2 8 3 5 5 7 8 9
union(9,4); // 0 1 2 8 3 5 5 7 8 9 => 0 1 2 8 3 5 5 7 8 8
union(2,1); // 0 1 2 8 3 5 5 7 8 8 => 0 1 1 8 3 5 5 7 8 8
union(5,0); // 0 1 1 8 3 5 5 7 8 8 => 0 1 1 8 3 0 5 7 8 8
union(7,2); // 0 1 1 8 3 0 5 7 8 8 => 0 1 1 8 3 0 5 1 8 8
union(6,1); // 0 1 1 8 3 0 5 1 8 8 => 1 1 1 8 3 0 5 1 8 8
union(7,3); // 1 1 1 8 3 0 5 1 8 8 => 1 8 1 8 3 0 5 1 8 8
// all done!

public class QuickUnionUF {
    private int[] parent;

    public QuickUnionUF(int N) {
        // set parent of each element to itself
        // N array accesses
        parent = new int[N];
        for (int i = 0; i < N; i++)
            parent[i] = i;
    }

    public int find(int p) {
        // chase parent pointers until reach root
        // depth of p array accesses
        while (p != parent[p])
            p = parent[p];
        return p;
    }

    public void union(int p, int q) {
        // change root of p to point to root of q
        // depth of p and q array accesses
        int i = find(p);
        int j = find(q);
        parent[i] = j;
    }
}
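To see quick-union at work, the sketch below replays the demo sequence of unions and prints the final parent array. The class name `QuickUnionDemo` and the `run()` helper are assumptions added so the snippet is self-contained; the find and union logic mirrors QuickUnionUF.

```java
import java.util.Arrays;

// Sketch only: QuickUnionDemo and run() are illustrative names.
class QuickUnionDemo {
    private static int[] parent;

    private static int find(int p) {
        while (p != parent[p]) p = parent[p]; // chase pointers to the root
        return p;
    }

    private static void union(int p, int q) {
        parent[find(p)] = find(q); // root of p now points to root of q
    }

    // Replays the demo sequence and returns the final parent[] array
    static int[] run() {
        parent = new int[10];
        for (int i = 0; i < 10; i++) parent[i] = i;
        union(4, 3); union(3, 8); union(6, 5); union(9, 4); union(2, 1);
        if (find(8) != find(9)) throw new IllegalStateException("8 and 9 should be connected");
        if (find(5) == find(4)) throw new IllegalStateException("5 and 4 should not be connected");
        union(5, 0); union(7, 2); union(6, 1); union(7, 3);
        return parent.clone();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
        // prints [1, 8, 1, 8, 3, 0, 5, 1, 8, 8]
    }
}
```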
| algorithm | initialize | union | find |
|---|---|---|---|
| quick-find | \(N\) | \(N\) | \(1\) |
| quick-union | \(N\) | \(N^\dagger\) | \(N\) |
\(\dagger\) includes cost of finding two roots
Note: analyzed quick-union for worst case
Quick-find defect
Quick-union defect
// worst-case input
union(0,1);
union(0,2);
union(0,3);
union(0,4);
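The effect of this input can be measured directly. The sketch below, with the assumed names `TallTreeDemo` and `depthAfterChain`, runs union(0, q) for q = 1 to n-1 with plain quick-union and reports how many links element 0 sits below its root.

```java
// Sketch only: TallTreeDemo and depthAfterChain are illustrative names.
class TallTreeDemo {
    private static int[] parent;

    private static int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    private static void union(int p, int q) {
        parent[find(p)] = find(q);
    }

    // Number of links from p up to its root
    private static int depth(int p) {
        int d = 0;
        while (p != parent[p]) { p = parent[p]; d++; }
        return d;
    }

    // Runs union(0,1), union(0,2), ..., union(0,n-1) and returns depth of 0
    static int depthAfterChain(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        for (int q = 1; q < n; q++) union(0, q);
        return depth(0);
    }

    public static void main(String[] args) {
        System.out.println(depthAfterChain(5)); // prints 4
    }
}
```

Each union hangs the current root under a fresh singleton, so the tree degenerates into a chain and find(0) costs about N array accesses.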
Weighted quick-union

Suppose that the parent[] array during weighted quick-union is
// 0 1 2 3 4 5 6 7 8 9
int [] parent = {0,0,0,0,0,0,7,8,8,8};

Which parent[] entry changes during union(2,6)?
A. parent[0]
B. parent[2]
C. parent[6]
D. parent[8]
union(4,3)
union(3,8)
union(6,5)
union(9,4)
union(2,1)
union(5,0)
union(7,2)
union(6,1)
union(7,3)
int[] parent = {0,1,2,3,4,5,6,7,8,9};
union(4,3); // 0 1 2 3 4 5 6 7 8 9 => 0 1 2 4 4 5 6 7 8 9
union(3,8); // 0 1 2 4 4 5 6 7 8 9 => 0 1 2 4 4 5 6 7 4 9
union(6,5); // 0 1 2 4 4 5 6 7 4 9 => 0 1 2 4 4 6 6 7 4 9
union(9,4); // 0 1 2 4 4 6 6 7 4 9 => 0 1 2 4 4 6 6 7 4 4
union(2,1); // 0 1 2 4 4 6 6 7 4 4 => 0 2 2 4 4 6 6 7 4 4
union(5,0); // 0 2 2 4 4 6 6 7 4 4 => 6 2 2 4 4 6 6 7 4 4
union(7,2); // 6 2 2 4 4 6 6 7 4 4 => 6 2 2 4 4 6 6 2 4 4
union(6,1); // 6 2 2 4 4 6 6 2 4 4 => 6 2 6 4 4 6 6 2 4 4
union(7,3); // 6 2 6 4 4 6 6 2 4 4 => 6 2 6 4 6 6 6 2 4 4
// all done!

A larger example: 100 sites, 88 union() operations
quick-union: average distance to root = 5.11

weighted quick-union: average distance to root = 1.52

Data structure: same as quick-union, but maintain extra array size[i] to count number of elements in the tree rooted at i, initially set to 1.
Find: identical to quick-union
Union: modify quick-union to:
link the root of the smaller tree to the root of the larger tree
update the size[] array
int i = find(p);
int j = find(q);
if (i == j) return;
if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
else                   { parent[j] = i; size[i] += size[j]; }
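Putting the pieces together, a complete class might look like the sketch below. The class name `WeightedQuickUnionUF` and the `parents()` accessor are assumptions for illustration; the union body is the fragment above, and find is unchanged from quick-union.

```java
import java.util.Arrays;

// Sketch only: the class name and the parents() accessor are assumptions.
class WeightedQuickUnionUF {
    private final int[] parent; // parent[i] is the parent of i in its tree
    private final int[] size;   // size[i] counts elements in tree rooted at i

    WeightedQuickUnionUF(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
    }

    int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i == j) return;
        // link the root of the smaller tree to the root of the larger tree
        if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
        else                   { parent[j] = i; size[i] += size[j]; }
    }

    int[] parents() {
        return parent.clone();
    }

    public static void main(String[] args) {
        WeightedQuickUnionUF uf = new WeightedQuickUnionUF(10);
        int[][] pairs = {{4,3},{3,8},{6,5},{9,4},{2,1},{5,0},{7,2},{6,1},{7,3}};
        for (int[] pq : pairs) uf.union(pq[0], pq[1]);
        System.out.println(Arrays.toString(uf.parents()));
        // prints [6, 2, 6, 4, 6, 6, 6, 2, 4, 4]
    }
}
```

When the two trees have equal size, this version links q's root below p's root, which is the tie-breaking that the demo trace follows.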
Running time
Proposition: depth of any node \(\textsf{x}\) is at most \(\lg N\) (in computer science, \(\lg\) means base-2 logarithm)
Example: \[N = 10\] \[\text{depth}(\textsf{x}) \leq \lg N \approx 3.32\]
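Proof: What causes the depth of element \(\textsf{x}\) to increase? It increases by 1 when the root of the tree \(\textsf{T1}\) containing \(\textsf{x}\) is linked to the root of another tree \(\textsf{T2}\). Because the smaller tree is always linked below the larger one, \(|\textsf{T2}| \geq |\textsf{T1}|\), so the size of the tree containing \(\textsf{x}\) at least doubles. A tree with at most \(N\) elements can double in size at most \(\lg N\) times, so the depth of \(\textsf{x}\) is at most \(\lg N\).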
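The bound can also be checked empirically. The sketch below performs random weighted unions and compares the largest observed depth with \(\lfloor \lg N \rfloor\). The class name `DepthBoundCheck`, the seed, and the number of random unions are arbitrary assumptions; this is a sanity check, not a proof.

```java
import java.util.Random;

// Sketch only: names, seed, and the 4n union count are arbitrary choices.
class DepthBoundCheck {
    private static int[] parent;
    private static int[] size;

    private static int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    private static void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i == j) return;
        if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
        else                   { parent[j] = i; size[i] += size[j]; }
    }

    private static int depth(int p) {
        int d = 0;
        while (p != parent[p]) { p = parent[p]; d++; }
        return d;
    }

    // Largest depth of any node after 4n random weighted unions
    static int maxDepthAfterRandomUnions(int n, long seed) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) { parent[i] = i; size[i] = 1; }
        Random rnd = new Random(seed);
        for (int k = 0; k < 4 * n; k++) union(rnd.nextInt(n), rnd.nextInt(n));
        int max = 0;
        for (int p = 0; p < n; p++) max = Math.max(max, depth(p));
        return max;
    }

    // floor(lg n) for n >= 1
    static int floorLg(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        int n = 1024;
        int max = maxDepthAfterRandomUnions(n, 42L);
        System.out.println("max depth = " + max + ", lg N = " + floorLg(n));
    }
}
```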

| algorithm | initialize | union | find |
|---|---|---|---|
| quick-find | \(N\) | \(N\) | \(1\) |
| quick-union | \(N\) | \(N^\dagger\) | \(N\) |
| weighted QU | \(N\) | \(\lg N^\dagger\) | \(\lg N\) |
\(\dagger\) includes cost of finding two roots
Note: analyzed quick-union for worst case
Key point: weighted quick-union makes it possible to solve problems that could not otherwise be addressed.
| algorithm | worst-case time |
|---|---|
| quick-find | \(M N\) |
| quick-union | \(M N\) |
| weighted QU | \(N + M \log N\) |
| QU + path compression | \(N + M \log N\) |
| weighted QU + path compression | \(N + M \lg^* N\) |
Order of growth for \(M\) union-find operations on a set of \(N\) elements
Example: \(10^9\) unions and finds with \(10^9\) elements
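The table lists path compression without showing code. One common way to add it, sketched here as an assumption rather than as the implementation these slides have in mind, is a two-pass find: locate the root, then point every node on the path directly at it. The class name `CompressedUF` and the `parentOf` accessor are hypothetical.

```java
// Sketch only: CompressedUF and parentOf are hypothetical names, and the
// two-pass compression shown is one common variant among several.
class CompressedUF {
    private final int[] parent;
    private final int[] size;

    CompressedUF(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) { parent[i] = i; size[i] = 1; }
    }

    int find(int p) {
        int root = p;
        while (root != parent[root]) root = parent[root]; // pass 1: find root
        while (p != root) {                               // pass 2: flatten path
            int next = parent[p];
            parent[p] = root;
            p = next;
        }
        return root;
    }

    void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i == j) return;
        if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
        else                   { parent[j] = i; size[i] += size[j]; }
    }

    int parentOf(int p) {
        return parent[p];
    }

    public static void main(String[] args) {
        CompressedUF uf = new CompressedUF(8);
        uf.union(0, 1); uf.union(2, 3); uf.union(0, 2);
        uf.union(4, 5); uf.union(6, 7); uf.union(4, 6);
        uf.union(0, 4);
        int root = uf.find(7);
        System.out.println(uf.parentOf(7) == root); // true: 7 now points at the root
    }
}
```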
bwlabel() function in image processing
The game of Hex is played on a diamond-shaped board of hexagons. Two players alternate turns by placing their colored stones (red/blue, white/black, etc.) on the board, attempting to make a connection between their respective opposite sides.

Q: How to determine if a player has won?
A: Model as a dynamic-connectivity problem and use union-find

Create a node for each hexagon tile, named \(0\) to \(N^2-1\)
Color the node of the player to represent placing a stone
Add edge between two adjacent nodes if they are similarly colored
Note: could add up to 6 edges
A player wins when there is a path between their opposite sides of the board from top–bottom or left–right
Example: check each node at top against each node at bottom
How can we check this more efficiently?
Clever trick: introduce 4 virtual nodes, edges where appropriate
A player wins when there is a path between opposite virtual nodes
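The virtual-node trick can be sketched as follows. The class name `HexBoard`, the (row, col) coordinate convention with its six neighbor offsets, and the choice that player 1 connects top to bottom while player 2 connects left to right are all assumptions made for illustration. The sketch also assumes stones are only placed on empty cells.

```java
// Sketch only: HexBoard, its coordinate convention, and the player-to-side
// assignment are assumptions; place() expects an empty cell.
class HexBoard {
    // Six neighbors of a hexagon at (row, col) on a rhombus-shaped board
    private static final int[][] NEIGHBORS =
        {{0, -1}, {0, 1}, {-1, 0}, {1, 0}, {-1, 1}, {1, -1}};

    private final int n;
    private final int[] parent;
    private final int[] size;
    private final int[] color; // 0 = empty, 1 = player 1, 2 = player 2
    private final int top, bottom, left, right; // the 4 virtual nodes

    HexBoard(int n) {
        this.n = n;
        int total = n * n + 4; // tiles 0 to n*n-1, then 4 virtual nodes
        parent = new int[total];
        size = new int[total];
        for (int i = 0; i < total; i++) { parent[i] = i; size[i] = 1; }
        color = new int[n * n];
        top = n * n; bottom = n * n + 1; left = n * n + 2; right = n * n + 3;
    }

    private int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    private void union(int p, int q) {
        int i = find(p), j = find(q);
        if (i == j) return;
        if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
        else                   { parent[j] = i; size[i] += size[j]; }
    }

    void place(int player, int row, int col) {
        int node = row * n + col;
        color[node] = player;
        // add an edge to each adjacent tile of the same color (up to 6)
        for (int[] d : NEIGHBORS) {
            int r = row + d[0], c = col + d[1];
            if (r >= 0 && r < n && c >= 0 && c < n && color[r * n + c] == player)
                union(node, r * n + c);
        }
        // add edges to this player's virtual nodes on the border
        if (player == 1) {
            if (row == 0) union(node, top);
            if (row == n - 1) union(node, bottom);
        } else {
            if (col == 0) union(node, left);
            if (col == n - 1) union(node, right);
        }
    }

    // A player has won when their two virtual nodes are in the same set
    boolean hasWon(int player) {
        return player == 1 ? find(top) == find(bottom)
                           : find(left) == find(right);
    }

    public static void main(String[] args) {
        HexBoard b = new HexBoard(3);
        b.place(1, 0, 1); b.place(2, 1, 0); b.place(1, 1, 1);
        System.out.println(b.hasWon(1)); // false
        b.place(1, 2, 1);
        System.out.println(b.hasWon(1)); // true
    }
}
```

With the virtual nodes in place, the win check is a single find comparison per player instead of testing every top node against every bottom node.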
Steps to developing a usable algorithm to solve a computational problem
This is the scientific method
Mathematical analysis
