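Given a set of \(N\) elements, support two operations: connect(p, q), which connects two elements, and isConnected(p, q), which asks whether a path connects them.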
    connect(4, 3)
    connect(3, 8)
    connect(6, 5)
    connect(9, 4)
    connect(2, 1)
    isConnected(8, 9)   // true
    isConnected(5, 7)   // false
    connect(5, 0)
    connect(7, 2)
    connect(6, 1)
    connect(1, 0)
    isConnected(5, 7)   // true

Is there a path connecting cyan and pink elements?
Yes.
Note: finding the path explicitly is a harder problem
Applications involve manipulating elements of all types
When programming, it is convenient to name the elements 0 to N-1.
We model "is connected to" as an equivalence relation:
- Reflexive: p is connected to p
- Symmetric: if p is connected to q, then q is connected to p
- Transitive: if p is connected to q and q is connected to r, then p is connected to r
3 disjoint sets / connected components
\[ \{0\}\ \{1,4,5\}\ \{2,3,6,7\} \]
- Find: in which set is element p?
- Union: replace the sets containing p and q with their union
\[\{0\}\ \{1,4,5\}\ \{2,3,6,7\}\quad\Rightarrow\quad\{0\}\ \{1,2,3,4,5,6,7\}\]
    find(5) != find(6)
    union(2, 5)          // 3 disjoint sets -> 2 disjoint sets
    find(5) == find(6)

How to model the dynamic-connectivity problem using union-find?
Maintain disjoint sets that correspond to connected components
union(2, 5)
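The mapping from dynamic connectivity to union-find can be sketched as follows. This is only an illustration: the class name `DynamicConnectivity` is hypothetical, and the array-based `find`/`union` inside it are a deliberately naive stand-in for whatever union-find implementation is eventually chosen.

```java
// Sketch: dynamic connectivity expressed with two union-find operations.
// DynamicConnectivity is a hypothetical name; the inner find/union are
// a naive placeholder, not an efficient implementation.
class DynamicConnectivity {
    private final int[] id;   // id[p] identifies the set containing p

    DynamicConnectivity(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;   // n singleton sets
    }

    private int find(int p) {
        return id[p];
    }

    private void union(int p, int q) {
        int pid = id[p];
        int qid = id[q];
        for (int i = 0; i < id.length; i++)
            if (id[i] == pid) id[i] = qid;
    }

    // connect(p, q): merge the components containing p and q
    void connect(int p, int q) {
        union(p, q);
    }

    // isConnected(p, q): true exactly when p and q are in the same set
    boolean isConnected(int p, int q) {
        return find(p) == find(q);
    }

    public static void main(String[] args) {
        DynamicConnectivity dc = new DynamicConnectivity(10);
        dc.connect(4, 3); dc.connect(3, 8); dc.connect(6, 5);
        dc.connect(9, 4); dc.connect(2, 1);
        System.out.println(dc.isConnected(8, 9));   // true
        System.out.println(dc.isConnected(5, 7));   // false
        dc.connect(5, 0); dc.connect(7, 2); dc.connect(6, 1); dc.connect(1, 0);
        System.out.println(dc.isConnected(5, 7));   // true
    }
}
```

The two public methods are thin wrappers, so any improvement to `find` and `union` carries over to connectivity queries unchanged.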
Goal: design an efficient union-find data type
    public class UF {
        UF(int N)                  // initialize union-find data structure with
                                   // N singleton sets (0 to N-1)
        void union(int p, int q)   // merge sets containing elements p and q
        int find(int p)            // identifier for set containing element p (0 to N-1)
    }

    public static void main(String[] args) {
        int N = StdIn.readInt();               // number of elements
        UF uf = new UF(N);
        while (!StdIn.isEmpty()) {
            int p = StdIn.readInt();
            int q = StdIn.readInt();
            if (uf.find(p) != uf.find(q)) {    // p and q not yet connected
                uf.union(p, q);
                StdOut.println(p + " " + q);
            }
        }
    }

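Note: with the input below, the pairs 8 9, 1 0, and 6 7 (lines 8, 12, and 13 of the listing, counting the command line as line 1) are already connected when they are read, so they are not printed.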
    % more tinyUF.txt
    10
    4 3
    3 8
    6 5
    9 4
    2 1
    8 9
    5 0
    7 2
    6 1
    1 0
    6 7

Data structure: id[] of length N, where id[p] identifies the set containing element p
\[ \{0,5,6\}\ \{1,2,7\}\ \{3,4,8,9\} \]
    //          0 1 2 3 4 5 6 7 8 9
    int[] id = {0,1,1,8,8,0,0,1,8,8};
    // find(5) == 0

Q: How to implement find(p)?
A: Easy, just return id[p]
\[ \{0,5,6\}\ \{1,2,7\}\ \{3,4,8,9\} \Rightarrow \{0,1,2,5,6,7\}\ \{3,4,8,9\} \]
    //          0 1 2 3 4 5 6 7 8 9
    int[] id = {0,1,1,8,8,0,0,1,8,8};
    union(6,1);
    // id = ??

Q: How to implement union(p,q)?
A: Change all entries whose identifier equals id[p] to id[q].
id = {1,1,1,8,8,1,1,1,8,8}
    public class QuickFindUF {
        private int[] id;

        public QuickFindUF(int N) {
            // set id of each element to itself (N array accesses)
            id = new int[N];
            for (int i = 0; i < N; i++)
                id[i] = i;
        }

        public int find(int p) {
            // return the id of p (1 array access)
            return id[p];
        }

        public void union(int p, int q) {
            // change all entries with id[p] to id[q]
            // (N+2 to 2N+2 array accesses)
            int pid = id[p];
            int qid = id[q];
            for (int i = 0; i < id.length; i++) {
                if (id[i] == pid)
                    id[i] = qid;
            }
        }
    }

algorithm | initialize | union | find |
---|---|---|---|
quick-find | \(N\) | \(N\) | \(1\) |
Note: ignoring leading constant
Union is too expensive! Processing a sequence of \(N\) union operations on \(N\) elements takes more than \(N^2\) (quadratic) array accesses.
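To make the quadratic claim concrete, here is a small instrumented variant of quick-find that counts array accesses inside `union`. It is a sketch under stated assumptions: the class name `QuickFindAccessCounter`, the `accesses` counter, and the particular union sequence (0-1, 1-2, ..., wrapping around) are all illustrative choices, and initialization is not counted.

```java
// Sketch: quick-find with a counter for array accesses made by union().
// All names here are hypothetical; the counting convention (one access per
// read or write of id[]) follows the comments in the quick-find code.
class QuickFindAccessCounter {
    private final int[] id;
    private long accesses = 0;   // reads and writes of id[] inside union()

    QuickFindAccessCounter(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;   // initialization not counted
    }

    void union(int p, int q) {
        int pid = id[p];
        int qid = id[q];
        accesses += 2;                  // the two reads above
        for (int i = 0; i < id.length; i++) {
            accesses++;                 // read id[i] for the comparison
            if (id[i] == pid) {
                id[i] = qid;
                accesses++;             // write id[i]
            }
        }
    }

    long accesses() {
        return accesses;
    }

    // Performs n unions on n elements: union(0,1), union(1,2), ..., union(n-1,0).
    static long accessesForNUnions(int n) {
        QuickFindAccessCounter qf = new QuickFindAccessCounter(n);
        for (int i = 0; i < n; i++) qf.union(i, (i + 1) % n);
        return qf.accesses();
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 8000; n *= 2) {
            long a = accessesForNUnions(n);
            System.out.println(n + " elements: " + a + " accesses, N^2 = " + (long) n * n);
        }
    }
}
```

Every call scans the whole array, so each union costs at least N+2 accesses and N unions cost more than \(N^2\) regardless of which pairs are merged.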
Rough standard (for now)
Ex. Huge problem for quick-find
Quadratic algorithms don't scale with technology
Data structure: parent[] of length N, where parent[i] is the parent of i in the tree
\[ \{0\}\ \{1\}\ \{2,3,4,9\}\ \{5,6\}\ \{7\}\ \{8\} \]
    //              0 1 2 3 4 5 6 7 8 9
    int[] parent = {0,1,9,4,9,6,6,7,8,9};
    // parent of 3 is 4, parent of 4 is 9, parent of 9 is 9
    // root of 3 is 9

Q: How to implement find(p)?
A: Return root of tree containing p
\[ \ldots \{2,3,4,9\} \{5,6\} \ldots \Rightarrow \ldots \{2,3,4,5,6,9\} \ldots \]
    //              0 1 2 3 4 5 6 7 8 9
    int[] parent = {0,1,9,4,9,6,6,7,8,9};
    union(3, 5);
    // parent = ???

Q: How to implement union(p,q)?
A: Set the parent of p's root to q's root (a root is its own parent).
\[ \ldots \{2,3,4,9\} \{5,6\} \ldots \Rightarrow \ldots \{2,3,4,5,6,9\} \ldots \]
    //              0 1 2 3 4 5 6 7 8 9
    int[] parent = {0,1,9,4,9,6,6,7,8,9};
    union(3, 5);
    // parent ==   {0,1,9,4,9,6,6,7,8,6}
    //                                ^ only one value changes!

    union(4,3)
    union(3,8)
    union(6,5)
    union(9,4)
    union(2,1)
    isConnected(8,9)
    !isConnected(5,4)
    union(5,0)
    union(7,2)
    union(6,1)
    union(7,3)

    //              0 1 2 3 4 5 6 7 8 9
    int[] parent = {0,1,2,3,4,5,6,7,8,9};
    union(4,3);  // 0 1 2 3 3 5 6 7 8 9
    union(3,8);  // 0 1 2 8 3 5 6 7 8 9
    union(6,5);  // 0 1 2 8 3 5 5 7 8 9
    union(9,4);  // 0 1 2 8 3 5 5 7 8 8
    union(2,1);  // 0 1 1 8 3 5 5 7 8 8
    union(5,0);  // 0 1 1 8 3 0 5 7 8 8
    union(7,2);  // 0 1 1 8 3 0 5 1 8 8
    union(6,1);  // 1 1 1 8 3 0 5 1 8 8
    union(7,3);  // 1 8 1 8 3 0 5 1 8 8
    // all done!

    public class QuickUnionUF {
        private int[] parent;

        public QuickUnionUF(int N) {
            // set parent of each element to itself
            // N array accesses
            parent = new int[N];
            for (int i = 0; i < N; i++)
                parent[i] = i;
        }

        public int find(int p) {
            // chase parent pointers until reach root
            // depth of p array accesses
            while (p != parent[p])
                p = parent[p];
            return p;
        }

        public void union(int p, int q) {
            // change root of p to point to root of q
            // depth of p and q array accesses
            int i = find(p);
            int j = find(q);
            parent[i] = j;
        }
    }

algorithm | initialize | union | find |
---|---|---|---|
quick-find | \(N\) | \(N\) | \(1\) |
quick-union | \(N\) | \(N^\dagger\) | \(N\) |
\(\dagger\) includes cost of finding two roots
Note: analyzed quick-union for worst case
Quick-find defect
Quick-union defect
    // worst-case input
    union(0,1);
    union(0,2);
    union(0,3);
    union(0,4);

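The worst-case input above can be checked with a short experiment. The sketch below repeats the quick-union logic and adds a `depth` helper, which is an assumption of this example and not part of the original class, to measure how far element 0 ends up from its root.

```java
// Sketch: quick-union on the worst-case input union(0,1), union(0,2), ...
// QuickUnionWorstCase and depth() are hypothetical names used for illustration.
class QuickUnionWorstCase {
    private final int[] parent;

    QuickUnionWorstCase(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        parent[i] = j;   // root of p now points to root of q
    }

    // number of links between p and its root
    int depth(int p) {
        int d = 0;
        while (p != parent[p]) {
            p = parent[p];
            d++;
        }
        return d;
    }

    public static void main(String[] args) {
        int n = 5;
        QuickUnionWorstCase qu = new QuickUnionWorstCase(n);
        for (int i = 1; i < n; i++) qu.union(0, i);
        System.out.println(qu.depth(0));   // 4: the tree is a chain 0 -> 1 -> 2 -> 3 -> 4
    }
}
```

Each union hangs the existing chain under a fresh root, so after N-1 unions element 0 sits at depth N-1 and a single find(0) costs time linear in N.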
Weighted quick-union
Suppose that the parent[] array during weighted quick-union is

    //              0 1 2 3 4 5 6 7 8 9
    int[] parent = {0,0,0,0,0,0,7,8,8,8};

Which parent[] entry changes during union(2,6)?
A. parent[0]
B. parent[2]
C. parent[6]
D. parent[8]
    union(4,3)
    union(3,8)
    union(6,5)
    union(9,4)
    union(2,1)
    union(5,0)
    union(7,2)
    union(6,1)
    union(7,3)

    //              0 1 2 3 4 5 6 7 8 9
    int[] parent = {0,1,2,3,4,5,6,7,8,9};
    union(4,3);  // 0 1 2 4 4 5 6 7 8 9
    union(3,8);  // 0 1 2 4 4 5 6 7 4 9
    union(6,5);  // 0 1 2 4 4 6 6 7 4 9
    union(9,4);  // 0 1 2 4 4 6 6 7 4 4
    union(2,1);  // 0 2 2 4 4 6 6 7 4 4
    union(5,0);  // 6 2 2 4 4 6 6 7 4 4
    union(7,2);  // 6 2 2 4 4 6 6 2 4 4
    union(6,1);  // 6 2 6 4 4 6 6 2 4 4
    union(7,3);  // 6 2 6 4 6 6 6 2 4 4
    // all done!

[Figure: trees produced by quick-union and by weighted quick-union]
A larger example: 100 sites, 88 union() operations
quick-union, average distance to root: 5.11
weighted quick-union, average distance to root: 1.52
Data structure: same as quick-union, but maintain an extra array size[i] to count the number of elements in the tree rooted at i, initially set to 1.
Find: identical to quick-union
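Union: modify quick-union to:
- link the root of the smaller tree to the root of the larger tree
- update the size[] array

    int i = find(p);
    int j = find(q);
    if (i == j) return;               // already in the same tree
    if (size[i] < size[j]) {
        parent[i] = j;                // smaller tree i goes under j
        size[j] += size[i];
    } else {
        parent[j] = i;                // smaller (or equal) tree j goes under i
        size[i] += size[j];
    }
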
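Putting the pieces together, a complete weighted quick-union class might look like the sketch below. The constructor and `find` are carried over from quick-union and the union body is the fragment above; the `parents()` accessor is a hypothetical addition so the trace can be printed.

```java
// Sketch: weighted quick-union assembled from the pieces described above.
// parents() is a hypothetical helper added only for inspecting the array.
class WeightedQuickUnionUF {
    private final int[] parent;   // parent[i] = parent of i in its tree
    private final int[] size;     // size[i] = number of elements in tree rooted at i

    WeightedQuickUnionUF(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
    }

    int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i == j) return;
        if (size[i] < size[j]) {
            parent[i] = j;
            size[j] += size[i];
        } else {
            parent[j] = i;
            size[i] += size[j];
        }
    }

    int[] parents() {
        return parent.clone();
    }

    public static void main(String[] args) {
        WeightedQuickUnionUF uf = new WeightedQuickUnionUF(10);
        int[][] ops = {{4,3},{3,8},{6,5},{9,4},{2,1},{5,0},{7,2},{6,1},{7,3}};
        for (int[] op : ops) uf.union(op[0], op[1]);
        // prints [6, 2, 6, 4, 6, 6, 6, 2, 4, 4], matching the trace above
        System.out.println(java.util.Arrays.toString(uf.parents()));
    }
}
```

When the two trees have equal size this version links q's root under p's root; either choice preserves the depth bound.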
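Running time: find takes time proportional to the depth of p; union takes constant time once the two roots are known.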
Proposition: depth of any node \(\textsf{x}\) is at most \(\lg N\) (in computer science, \(\lg\) means base-2 logarithm)
\[N = 10\]
\[\text{depth}(\textsf{x}) \leq \lg N \approx 3.32\]
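Proof: What causes the depth of element \(\textsf{x}\) to increase? It increases by 1 when the root of the tree \(\textsf{T1}\) containing \(\textsf{x}\) is linked to the root of another tree \(\textsf{T2}\). Weighting links \(\textsf{T1}\) below \(\textsf{T2}\) only when \(|\textsf{T2}| \geq |\textsf{T1}|\), so the size of the tree containing \(\textsf{x}\) at least doubles each time. That tree starts with 1 element and never holds more than \(N\), so it can double at most \(\lg N\) times.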
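The bound can also be checked empirically. The sketch below is not part of the proof: it applies random weighted unions and reports the deepest node, which should never exceed \(\lfloor \lg N \rfloor\). The class name `DepthBoundCheck`, the seed, and the choice of 4N random pairs are arbitrary assumptions.

```java
// Sketch: empirical check that weighted quick-union keeps depth <= lg N.
// DepthBoundCheck and its helpers are hypothetical names for illustration.
class DepthBoundCheck {
    static int root(int[] parent, int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    // Applies 4n random weighted unions, then returns the maximum node depth.
    static int maxDepthAfterRandomUnions(int n, long seed) {
        int[] parent = new int[n];
        int[] size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
        java.util.Random rnd = new java.util.Random(seed);
        for (int k = 0; k < 4 * n; k++) {
            int i = root(parent, rnd.nextInt(n));
            int j = root(parent, rnd.nextInt(n));
            if (i == j) continue;
            if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
            else                   { parent[j] = i; size[i] += size[j]; }
        }
        int max = 0;
        for (int p = 0; p < n; p++) {
            int d = 0;
            for (int x = p; x != parent[x]; x = parent[x]) d++;
            max = Math.max(max, d);
        }
        return max;
    }

    // floor of the base-2 logarithm of n, for n >= 1
    static int floorLg(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        int n = 1024;
        System.out.println("max depth = " + maxDepthAfterRandomUnions(n, 42L)
                + ", bound = " + floorLg(n));
    }
}
```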
algorithm | initialize | union | find |
---|---|---|---|
quick-find | \(N\) | \(N\) | \(1\) |
quick-union | \(N\) | \(N^\dagger\) | \(N\) |
weighted QU | \(N\) | \(\lg N^\dagger\) | \(\lg N\) |
\(\dagger\) includes cost of finding two roots
Note: analyzed quick-union for worst case
Key point: weighted quick-union makes it possible to solve problems that could not otherwise be addressed.
algorithm | worst-case time |
---|---|
quick-find | \(M N\) |
quick-union | \(M N\) |
weighted QU | \(N + M \log N\) |
QU + path compression | \(N + M \log N\) |
weighted QU + path compression | \(N + M \lg^* N\) |
Order of growth for \(M\) union-find operations on a set of \(N\) elements
Example: \(10^9\) unions and finds with \(10^9\) elements
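The table mentions path compression without showing it. One common way to add it, sketched here as an assumption and not necessarily the variant the table was measured with, is a two-pass find: locate the root, then point every node on the path directly at it. The class name `CompressedUF` and the `depth` helper are hypothetical.

```java
// Sketch: weighted quick-union with two-pass path compression in find().
// CompressedUF and depth() are hypothetical names; other compression
// variants (for example path halving) exist and are not shown here.
class CompressedUF {
    private final int[] parent;
    private final int[] size;

    CompressedUF(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
    }

    int find(int p) {
        int root = p;
        while (root != parent[root]) root = parent[root];   // pass 1: locate root
        while (p != root) {                                  // pass 2: relink path
            int next = parent[p];
            parent[p] = root;
            p = next;
        }
        return root;
    }

    void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i == j) return;
        if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
        else                   { parent[j] = i; size[i] += size[j]; }
    }

    // links between p and its root, measured without compressing anything
    int depth(int p) {
        int d = 0;
        while (p != parent[p]) {
            p = parent[p];
            d++;
        }
        return d;
    }

    public static void main(String[] args) {
        CompressedUF uf = new CompressedUF(8);
        uf.union(0, 1); uf.union(2, 3); uf.union(0, 2);
        uf.union(4, 5); uf.union(6, 7); uf.union(4, 6);
        uf.union(0, 4);
        System.out.println(uf.depth(7));   // 3: path 7 -> 6 -> 4 -> 0
        uf.find(7);
        System.out.println(uf.depth(7));   // 1: 7 now points straight at the root
    }
}
```

Compression does extra work only on paths that are actually traversed, which is why it adds so little to the cost of each find.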
bwlabel() function in image processing
The game of Hex is played on a diamond-shaped board of hexagons. Two players alternate turns by placing their colored stones (red/blue, white/black, etc.) on the board, attempting to make a connection between their respective opposite sides.
Q: How to determine if a player has won?
A: Model as a dynamic-connectivity problem and use union-find
Create a node for each hexagon tile, named \(0\) to \(N^2-1\)
Color the node of the player to represent placing a stone
Add edge between two adjacent nodes if they are similarly colored
Note: could add up to 6 edges
A player wins when there is a path between their opposite sides of the board from top–bottom or left–right
Example: check each node at top against each node at bottom
How can we check this more efficiently?
Clever trick: introduce 4 virtual nodes, edges where appropriate
A player wins when there is a path between opposite virtual nodes
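The steps above can be sketched in code. Everything specific here is an assumption made for illustration: the class name `HexBoard`, numbering players 1 (top to bottom) and 2 (left to right), the row/column neighbor offsets for a rhombus-shaped layout, and the lack of any check that a tile is empty before a stone is placed.

```java
// Sketch: Hex win detection with union-find and 4 virtual nodes.
// HexBoard, the player numbering, and NEIGHBORS are hypothetical choices.
class HexBoard {
    // six neighbors of (row, col), assuming a rhombus layout in row/col coordinates
    private static final int[][] NEIGHBORS =
        {{-1, 0}, {-1, 1}, {0, -1}, {0, 1}, {1, -1}, {1, 0}};

    private final int n;
    private final int[] color;    // 0 = empty, 1 = player 1, 2 = player 2
    private final int[] parent;
    private final int[] size;
    private final int top, bottom, left, right;   // virtual nodes

    HexBoard(int n) {
        this.n = n;
        color = new int[n * n];
        parent = new int[n * n + 4];
        size = new int[n * n + 4];
        for (int i = 0; i < parent.length; i++) {
            parent[i] = i;
            size[i] = 1;
        }
        top = n * n;
        bottom = n * n + 1;
        left = n * n + 2;
        right = n * n + 3;
    }

    private int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    private void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i == j) return;
        if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
        else                   { parent[j] = i; size[i] += size[j]; }
    }

    // Place a stone for the given player and add edges to same-colored neighbors.
    void place(int player, int row, int col) {
        int node = row * n + col;
        color[node] = player;
        for (int[] d : NEIGHBORS) {
            int r = row + d[0];
            int c = col + d[1];
            if (r >= 0 && r < n && c >= 0 && c < n && color[r * n + c] == player)
                union(node, r * n + c);
        }
        if (player == 1) {
            if (row == 0) union(node, top);
            if (row == n - 1) union(node, bottom);
        } else {
            if (col == 0) union(node, left);
            if (col == n - 1) union(node, right);
        }
    }

    // A player has won when their two virtual nodes are in the same set.
    boolean hasWon(int player) {
        return player == 1 ? find(top) == find(bottom)
                           : find(left) == find(right);
    }

    public static void main(String[] args) {
        HexBoard board = new HexBoard(3);
        board.place(1, 0, 0);
        board.place(2, 1, 1);
        board.place(1, 1, 0);
        System.out.println(board.hasWon(1));   // false
        board.place(1, 2, 0);
        System.out.println(board.hasWon(1));   // true
    }
}
```

With the virtual nodes in place, the win check is a single pair of find calls per player, instead of comparing every tile on one side against every tile on the opposite side.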
Steps to developing a usable algorithm to solve a computational problem
This is the scientific method
Mathematical analysis