| implementation | search\(^*\) | insert\(^*\) | delete\(^*\) | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ordered | ops on keys |
|---|---|---|---|---|---|---|---|---|
| seq search (unordered list) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | | equals() |
| binary search (ordered array) | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | X | compareTo() |
| BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | X | compareTo() |
| goal | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
Challenge: Guarantee performance
This lecture: 2-3 trees, left-leaning red-black BSTs, B-trees
Allow 1 or 2 keys per node
Symmetric order: Inorder traversal yields keys in ascending order
Perfect balance: Every path from root to null link has same length (how to maintain?)

Search
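A minimal sketch of search in a 2-3 tree, assuming integer keys and a hypothetical `Node23` class that stores 1 or 2 keys plus one more link than keys. The names `Node23`, `leaf`, and `Search23` are illustrative only and do not come from the lecture.

```java
// Hypothetical 2-3 tree node: 1 key (2-node) or 2 keys (3-node).
final class Node23 {
    final int[] keys;         // 1 or 2 keys in ascending order
    final Node23[] children;  // keys.length + 1 links; all null at the bottom

    Node23(int[] keys, Node23[] children) {
        this.keys = keys;
        this.children = children;
    }

    // Convenience factory for a bottom node (every link is null).
    static Node23 leaf(int... keys) {
        return new Node23(keys, new Node23[keys.length + 1]);
    }
}

final class Search23 {
    // Returns true if key occurs in the subtree rooted at x.
    static boolean contains(Node23 x, int key) {
        while (x != null) {
            int i = 0;
            // Skip keys smaller than the search key.
            while (i < x.keys.length && key > x.keys[i]) i++;
            if (i < x.keys.length && key == x.keys[i]) return true;  // search hit
            x = x.children[i];  // follow the link for the interval containing key
        }
        return false;  // reached a null link: search miss
    }

    public static void main(String[] args) {
        // Root 3-node (10, 20) with three bottom nodes.
        Node23 root = new Node23(new int[] { 10, 20 },
            new Node23[] { Node23.leaf(5), Node23.leaf(12, 15), Node23.leaf(30) });
        System.out.println(contains(root, 15));
        System.out.println(contains(root, 17));
    }
}
```

Because every path from the root to a null link has the same length, the loop runs at most (height + 1) times.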









Insertion into a 2-node at bottom

Insertion into a 3-node at bottom

Invariants: Maintains symmetric order and perfect balance
Pf: Each transformation maintains symmetric order and perfect balance



Splitting a 4-node is a local transformation: constant number of operations
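To illustrate why the split is local, here is a sketch under assumed names (`Split4Node`, `Node`, `Split` are hypothetical). It splits a temporary 4-node with keys a < b < c into two 2-nodes and hands the middle key back to the caller, which would insert it into the parent. Only the 4-node's own keys and four links are touched.

```java
final class Split4Node {
    // Hypothetical multiway node, for illustration only.
    static final class Node {
        final int[] keys;       // keys in ascending order
        final Node[] children;  // keys.length + 1 links (null at the bottom)
        Node(int[] keys, Node[] children) {
            this.keys = keys;
            this.children = children;
        }
    }

    // Result of a split: the middle key moves up to the parent and
    // the two outer keys become separate 2-nodes.
    static final class Split {
        final Node left;
        final int middle;
        final Node right;
        Split(Node left, int middle, Node right) {
            this.left = left;
            this.middle = middle;
            this.right = right;
        }
    }

    // Constant number of operations, regardless of the size of the subtrees.
    static Split split(Node four) {
        if (four.keys.length != 3) throw new IllegalArgumentException("not a 4-node");
        Node left = new Node(new int[] { four.keys[0] },
                             new Node[] { four.children[0], four.children[1] });
        Node right = new Node(new int[] { four.keys[2] },
                              new Node[] { four.children[2], four.children[3] });
        return new Split(left, four.keys[1], right);
    }

    public static void main(String[] args) {
        Node four = new Node(new int[] { 10, 20, 30 }, new Node[4]);
        Split s = split(four);
        System.out.println(s.left.keys[0] + " | " + s.middle + " | " + s.right.keys[0]);
    }
}
```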


What is the range of heights of a 2-3 tree with \(N\) keys (best / worst case)?
A. \(\sim \log_4 N\) / \(\sim \log_3 N\)
B. \(\sim \log_3 N\) / \(\sim \log_2 N\)
C. \(\sim \log_3 N\) / \(\sim 2 \log_2 N\)
D. \(\sim \log_3 N\) / \(\sim N\)
E. I don't know
Perfect balance: Every path from root to null link has same length

Tree height: between \(\sim \log_3 N\) (all 3-nodes) and \(\sim \log_2 N\) (all 2-nodes)
Bottom line: Guaranteed logarithmic performance for search and insert
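For a sense of scale, the sketch below evaluates the two extremes of 2-3 tree height, assuming the best case is a tree of all 3-nodes (about \(\log_3 N\)) and the worst case a tree of all 2-nodes (about \(\log_2 N\)). The class name `Height23` and the sample sizes are arbitrary choices for illustration.

```java
final class Height23 {
    // Logarithm of n in the given base.
    static double log(double base, double n) {
        return Math.log(n) / Math.log(base);
    }

    public static void main(String[] args) {
        double[] sizes = { 1e6, 1e9 };
        for (double n : sizes) {
            // Best case: all 3-nodes. Worst case: all 2-nodes.
            System.out.printf("N = %.0e: height between %.1f and %.1f%n",
                              n, log(3, n), log(2, n));
        }
    }
}
```

For a million keys the height stays between roughly 12 and 20, which is why search and insert remain cheap even for large trees.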
| implementation | search\(^*\) | insert\(^*\) | delete\(^*\) | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ordered | ops on keys |
|---|---|---|---|---|---|---|---|---|
| seq search (unordered list) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | | equals() |
| binary search (ordered array) | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | X | compareTo() |
| BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | X | compareTo() |
| 2-3 tree\(^\ddagger\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
\(^\ddagger\)but hidden constant \(c\) is large (depends upon implementation)
Direct implementation is complicated, because
// fantasy code
public void put(Key key, Value val) {
    Node x = root;
    while (x.getTheCorrectChild(key) != null) {
        x = x.getTheCorrectChild(key);
        if (x.is4Node()) x.split();
    }
    if      (x.is2Node()) x.make3Node(key, val);
    else if (x.is3Node()) x.make4Node(key, val);
}
Bottom line: Could do it, but there's a better way
Challenge: How to represent a 3-node?

Approach 1: Regular BST

Approach 2: Regular BST with red "glue" nodes

Approach 3: Regular BST with red "glue" links


A 2-3 tree and corresponding red-black BST

Key property: 1-1 correspondence between 2-3 and LLRB

A BST such that
No node has two red links connected to it
Every path from root to null link has the same number of black links ("perfect black balance")
Red links lean left
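The conditions can be checked mechanically. The sketch below assumes the usual LLRB conditions (red links lean left, no node touches two red links, perfect black balance) and integer keys; `LlrbCheck`, `is23`, and `blackHeight` are illustrative names, not part of the lecture code.

```java
final class LlrbCheck {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static final class Node {
        final int key;
        final boolean color;  // color of the link from the parent
        final Node left, right;
        Node(int key, boolean color, Node left, Node right) {
            this.key = key;
            this.color = color;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node x) {
        return x != null && x.color == RED;  // null links are black
    }

    // True if there is no right-leaning red link and no two red links in a row.
    static boolean is23(Node x) {
        if (x == null) return true;
        if (isRed(x.right)) return false;
        if (isRed(x) && isRed(x.left)) return false;
        return is23(x.left) && is23(x.right);
    }

    // Number of black nodes on every path from x down to a null link,
    // or -1 if two paths disagree (black balance violated).
    static int blackHeight(Node x) {
        if (x == null) return 0;
        int l = blackHeight(x.left);
        int r = blackHeight(x.right);
        if (l < 0 || l != r) return -1;
        return l + (isRed(x) ? 0 : 1);
    }

    public static void main(String[] args) {
        // Legal 3-node: black B with a red left child A.
        Node good = new Node(2, BLACK, new Node(1, RED, null, null), null);
        // Illegal: the red link leans right.
        Node bad = new Node(1, BLACK, null, new Node(2, RED, null, null));
        System.out.println(is23(good) + " " + blackHeight(good));
        System.out.println(is23(bad));
    }
}
```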
Observation: Search is the same as for elementary BST (ignore color), but runs faster because of better balance
public Value get(Key key) {
    Node x = root;
    while (x != null) {
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x = x.left;
        else if (cmp > 0) x = x.right;
        else              return x.val;
    }
    return null;
}
Remark: Most other ops (e.g., floor, iteration, selection) are also identical
Each node is pointed to by precisely one link (from its parent); can encode color of links in nodes
private static final boolean RED = true;
private static final boolean BLACK = false;
private class Node {
    Key key;
    Value val;
    Node left, right;
    boolean color; // color of parent link
    Node(Key key, Value val, boolean color) {
        this.key = key;
        this.val = val;
        this.color = color;
    }
}
private boolean isRed(Node x) {
    if (x == null) return false; // null links are black
    return x.color == RED;
}

Example: root.left.color == RED, root.right.color == BLACK
Basic strategy: Maintain 1-1 correspondence with 2-3 trees
During internal operations, maintain: symmetric order and perfect black balance (red links may temporarily lean right)

How? Apply elementary red-black BST operations: rotation and color flip
Left rotation: Orient a (temporarily) right-leaning red link to lean left
private Node rotateLeft(Node h) {
    assert isRed(h.right);
    Node x = h.right;
    h.right = x.left;
    x.left = h;
    x.color = h.color;
    h.color = RED;
    return x;
}
Invariants: Maintains symmetric order and perfect black balance

Right rotation: Orient a left-leaning red link to (temporarily) lean right
private Node rotateRight(Node h) {
    assert isRed(h.left);
    Node x = h.left;
    h.left = x.right;
    x.right = h;
    x.color = h.color;
    h.color = RED;
    return x;
}
Invariants: Maintains symmetric order and perfect black balance

Color flip: Recolor to split a (temporary) 4-node
private void flipColors(Node h) {
    assert !isRed(h);
    assert isRed(h.left);
    assert isRed(h.right);
    h.color = RED;
    h.left.color = BLACK;
    h.right.color = BLACK;
}
Invariants: Maintains symmetric order and perfect black balance

Warmup 1: Insert into a tree with exactly 1 node
A. Search ends at left null link of root: the new red link converts the 2-node to a 3-node
B. Search ends at right null link of root: the new red link is right-leaning, so rotate left

Case 1: Insert into a 2-node at the bottom

Warmup 2: Insert into a tree with exactly 2 nodes
Three cases, depending on the null link where the search ends

Case 2: Insert into a 3-node at the bottom
[Figure: insertion trace; surviving annotations: "red, so flip colors" and "red, so rotate left"]
[Figures: LLRB construction trace, one figure per step: insert E, A, R, C, H, X, M, P, L]
Same code for all cases
private Node put(Node h, Key key, Value val) {
    if (h == null) {
        // insert at bottom and color it red
        return new Node(key, val, RED);
    }
    int cmp = key.compareTo(h.key);
    if      (cmp < 0) h.left  = put(h.left, key, val);
    else if (cmp > 0) h.right = put(h.right, key, val);
    else              h.val   = val;
    // only a few extra lines of code provide near-perfect balance
    // lean left
    if (isRed(h.right) && !isRed(h.left))    h = rotateLeft(h);
    // balance 4-node
    if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);
    // split 4-node
    if (isRed(h.left) && isRed(h.right))     flipColors(h);
    return h;
}
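For experimenting with the insertion code, here is a self-contained sketch that wires the pieces together. It assumes integer keys and string values; the class name `LlrbDemo`, the public `put` wrapper that recolors the root black, and the `height` helper are assumptions added for illustration.

```java
final class LlrbDemo {
    static final boolean RED = true;
    static final boolean BLACK = false;

    static final class Node {
        int key;
        String val;
        Node left, right;
        boolean color;  // color of the link from the parent
        Node(int key, String val, boolean color) {
            this.key = key;
            this.val = val;
            this.color = color;
        }
    }

    Node root;

    static boolean isRed(Node x) { return x != null && x.color == RED; }

    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    static void flipColors(Node h) {
        h.color = RED;
        h.left.color = BLACK;
        h.right.color = BLACK;
    }

    // Assumed public wrapper: insert, then keep the root black.
    void put(int key, String val) {
        root = put(root, key, val);
        root.color = BLACK;
    }

    static Node put(Node h, int key, String val) {
        if (h == null) return new Node(key, val, RED);
        if      (key < h.key) h.left  = put(h.left, key, val);
        else if (key > h.key) h.right = put(h.right, key, val);
        else                  h.val   = val;
        if (isRed(h.right) && !isRed(h.left))    h = rotateLeft(h);   // lean left
        if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);  // balance 4-node
        if (isRed(h.left) && isRed(h.right))     flipColors(h);       // split 4-node
        return h;
    }

    String get(int key) {
        Node x = root;
        while (x != null) {
            if      (key < x.key) x = x.left;
            else if (key > x.key) x = x.right;
            else                  return x.val;
        }
        return null;
    }

    // Number of links on the longest path from x to a node with no children.
    static int height(Node x) {
        return x == null ? -1 : 1 + Math.max(height(x.left), height(x.right));
    }

    public static void main(String[] args) {
        LlrbDemo t = new LlrbDemo();
        for (int i = 1; i <= 255; i++) t.put(i, "v" + i);  // ascending order
        System.out.println("height = " + height(t.root));
        System.out.println("2 lg N = " + 2 * Math.log(255) / Math.log(2));
    }
}
```

Ascending insertions are the worst case for an elementary BST, so they make a convenient stress test here: the printed height can be compared against \(2 \lg N\).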
255 insertions in ascending order

255 insertions in descending order

255 random insertions

What is the height of an LLRB tree with \(N\) keys in the worst case?
A. \(\sim \log_3 N\)
B. \(\sim \log_2 N\)
C. \(\sim 2 \log_2 N\)
D. \(\sim N\)
E. I don't know
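Proposition: Height of tree is \(\leq 2 \lg N\) in the worst case
Pf: Every path from root to null link has the same number of black links, which equals the height of the corresponding 2-3 tree and is therefore at most \(\lg N\). Two red links never appear in a row, so a path has no more red links than black links.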

Property: Height of tree is \(\sim 1.0 \lg N\) in typical applications
| implementation | search\(^*\) | insert\(^*\) | delete\(^*\) | search\(^\dagger\) | insert\(^\dagger\) | delete\(^\dagger\) | ordered | ops on keys |
|---|---|---|---|---|---|---|---|---|
| seq search (unordered list) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | \(N\) | | equals() |
| binary search (ordered array) | \(\log N\) | \(N\) | \(N\) | \(\log N\) | \(N\) | \(N\) | X | compareTo() |
| BST | \(N\) | \(N\) | \(N\) | \(\log N\) | \(\log N\) | \(\sqrt{N}\) | X | compareTo() |
| 2-3 tree\(^\ddagger\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
| LLRB\(^\star\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | \(\log N\) | X | compareTo() |
\(^*\)guarantee, \(^\dagger\)average
\(^\ddagger\)hidden constant \(c\) is large (depends upon implementation)
\(^\star\)hidden constant \(c\) is small (at most \(2 \lg N\) compares)
Xerox PARC innovations (1970s)
Telephone company contracted with database provider to build real-time database to store customer information
Database implementation
Extended telephone service outage
“If implemented properly, the height of a red-black BST with \(N\) keys is at most \(2 \lg N\).”
—expert witness

Property: time required for a probe is much larger than time to access data within a page
Cost model: number of probes
Goal: access data using minimum number of probes
B-tree: Generalize 2-3 trees by allowing up to \(M\) keys per node



Proposition: A search or an insertion in a B-tree of order \(M\) with \(N\) keys requires between \(\sim \log_M N\) and \(\sim \log_{M/2} N\) probes.
Pf: All nodes (except possibly root) have between \(\left\lfloor M/2 \right\rfloor\) and \(M\) keys
In practice: Number of probes is at most \(4\) (with \(M = 1024\) and \(N = 62 \text{ billion}\), \(\log_{M/2} N \leq 4\))
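The arithmetic behind this claim can be checked with a short sketch. It takes the bounds from the proposition at face value (\(\log_M N\) and \(\log_{M/2} N\)); the class name `BTreeProbes` and the helper `logBase` are illustrative.

```java
final class BTreeProbes {
    // Logarithm of n in the given base.
    static double logBase(double base, double n) {
        return Math.log(n) / Math.log(base);
    }

    public static void main(String[] args) {
        double m = 1024;   // B-tree order
        double n = 62e9;   // number of keys
        double best = logBase(m, n);       // nodes as full as possible
        double worst = logBase(m / 2, n);  // nodes as empty as allowed
        System.out.printf("log_M N     = %.3f%n", best);
        System.out.printf("log_(M/2) N = %.3f%n", worst);
        System.out.println("probes needed (worst case, rounded up) = " + (int) Math.ceil(worst));
    }
}
```

Since \(512^4 = 2^{36}\) is about 68.7 billion, four levels suffice for 62 billion keys even when every node is only half full.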
Which of the following does the B in B-tree not mean?
A. Bayer
B. Balanced
C. Binary
D. Boeing
E. I don't know
“the more you think about what the B in B-trees could mean, the more you learn about B-trees and that is good.”
–Ed McCreight
Red-Black trees are widely used as system symbol tables
java.util.TreeMap, java.util.TreeSet
linux/rbtree.h
B-tree cousins: B+ tree, B* tree, B# tree, ...
B-trees (and cousins) are widely used for file systems and DBs
