CS231 Algorithms Prof. Lyn Turbak Wellesley College
Handout # 21 November 2, 2001
Red-Black Trees Reading: CLRS Chapter 13; 14.1 – 14.2; CLR Chapter 14; 15.1 – 15.2. Definition A red-black tree (RBT) is a binary search tree that satisfies the following red-black properties: 1. Every node has a color that is either red or black. 2. Every leaf is black. 3. If a node is red, both children are black. 4. Every path from a given node down to any descendant leaf contains the same number of black nodes. The number of black nodes on such a path (not including the initial node but including leaves) is called the black-height (bh) of the node. 5. The root of the tree is black (not a CLR property, but should be). Example
D
A
B
2
1
C
3
K I
1
G E
H
1
F
N
2
J
2
3
M
1
L
1
1
2
P
1
O
2
1
Q
1
R
1
1
S
1
Balance Property of Red-Black Trees Consider a subtree with n nodes (non-leaves) rooted at any node x within a red-black tree. Then the following relationships hold: • height(x)/2 ≤ bh(x) ≤ height(x) • 2bh(x) − 1 ≤ n < 2height(x) • lg(n) < height(x) ≤ 2 lg(n + 1) The last relationship is the sense in which a red-black tree is balanced. 1
Red-Black Tree Insertion To insert value V into red-black tree T: Step 1: Use usual BST insertion algorithm, coloring new node red. RBT properties (1), (2), and (4) do not change. RBT property (3) will not hold if there is a red-red violation. Step 2: Remove any red-red violation via the following rotation rules.1 It may be necessary to apply the rules multiple times. What is the maximal number of times the rules can be applied in the worst case?
Z d
Y c
X a
b
X Z
Y a
Z
d
X
X
Z d
Y
a
Y
a
b
c
d b
b
c
c X a
Y b
Z c
d
Step 3: If Step 1 or Step 2 leaves the root of the tree red, reassert RBT property (5) by blackening the root. Why is this necessary? (Hint: look at the rules in Step 2.) 1
These are simpler, but less efficient than, those in CLR. They are due to Chris Okasaki, Purely Functional Data Structures, Cambridge University Press, 1998.
2
Red-Black Tree Insertion Example Insert the letters A L G O R I T H M in order into a red-black tree.
A A insert
→
A
blacken
→
A
A
root
G
insert
insert
L
G
→
L
L
→
rotate
blacken
→
A
→
root
L
G
G G A
G insert
A
L
→
A
L
L
insert
rotate
→
O
R
→
O
O R
G
G
G A A
O L
O
A
insert
insert
I
T
→
L
R
O insert
→
L
→ H
R
R I
I
3
T
Red-Black Tree Insertion Example (continued)
G G A
O A L
R
I
O
rotate
rotate
→
I
→
R
T H
L
T
H
I
I
G
O
G
O
blacken
A
H
L
→
root
R
insert
A
H
L
T
G
O H
M
T
I
A
→
R
L
R M
4
T
More Efficient Red-Red Violation Elimination, Part 1 The rotation rules presented earlier for eliminating red-red violations are simple but can require a number of rotations proportional to the height of the tree. CLR present more complex but efficient rules. Here, we present
a
their rules in a different style, using the notation
to stand for a red-black tree named a rooted at a red node
b
and to stand for a red-black tree named b rooted at a black node. Case 1 of CLR’s RB-Insert handles the case where the sibling of the top red node N of red-red violation is red. In this case, the blackness of N ’s parent can be distributed among N and its sibling, as illustrated by the following rules:
Y X
Y
b
a
X
c
c
a
X
b
a
Y
b
c
b
b
Z
b
Z
a c
Note that no rotations are required, but the red-red violation may move up the tree.
5
c
X
a
c
Y
a
X
Z
c
X
a
b
a
Z X
X
b
c
More Efficient Red-Red Violation Elimination, Part 2 Case 2 and Case 3 of CLR’s RB-Insert handle the situation where the sibling of the top red node N of red-red violation is black. In this case, a single rotation2 eliminates the red-red violation, as illustrated by the following rules:
Z Y X
a
d c
b
Z
X Y
X
d
Z
a X
Z
Y
a
Y a
b
b
c
d
d b
c
c
X Y
a
Z
b c
d
This more efficient approach of eliminating red-red violations needs at most Θ(lg(n)) rule applications and one rotation. 2
CLR presents the rules in such a way that two rotations may be required, but this can be optimized to one.
6
Red-Black Tree Deletion To delete value V from red-black tree T: Step 1: Use usual BST deletion algorithm, replacing a deleted node by its predecessor (or successor) in the case where neither child is a leaf. Let N be the node with a child leaf that is deleted. If N is black, property (4) (uniformity of black-height) is violated. Reassert it by making the “other” child of N doubly-black Step 2: Propagate double-blackness up the tree using the following rules.3 The blackness token turns a black node doubly-black and turns a red node black. Using these rules, the black height invariant can be reasserted with Θ(lg(n)) rule applications and at most 2 rotations. • A. The sibling of the doubly-black node is black and one nephew is red (CLR’s RB-Delete Cases 3 and 4). This rule eliminates the blackness token with one rotation.
Y
X X
Z
a
b
Z b
d
Y
X
c
Y
a b
d
Z
a c
c
d
• B. The sibling and both nephews of the doubly-black node are black (CLR’s RB-Delete Case 2). This rule propagates the blackness token upward without rotation.
X
X Y
a b
Y
a c
b
c
• C. The sibling of the doubly-black node is red (CLR’s RB-Delete Case 1). This enables Case A or B.
Y
X Y
a b
X
c
a
c b
Step 3: If the doubly black token progagates to root, remove it. 3
Only the cases where the doubly-black node is a left child are shown; the cases where the doubly-black node is a right child are symmetric.
7
Red-Black Tree Deletion Example Delete the letters A L G O R I T H M in order from the red black tree constructed before.
I
I
G
O
G
O
delete
A
H
L
A
R M
CaseB
→
H
L
T
M
I
T
I
G
O
G
O
CaseB
H
L
unblacken
→
R M
H
L
T
→
root
R M
I
T
I
G
O
G
O
delete
H
→
R
L
L
R M
blacken
→
H
T
M
→
red
R T
8
Red-Black Tree Deletion Example (continued)
I
I
G
O
O
H delete
H
M
blacken
→ G
R
M
→
red
R
T
T
I
I O
H
H
M
delete
replace
O
by pred
→
R
M
→
R
T
T
I I H
I
M CaseA
→
R
R
H
delete
→
H
R
M T
9
T
M
T
Red-Black Tree Deletion Example (continued)
I
replace
→
I M
H
M
H
CaseB
→
by pred
I M
H
blacken
→
red
T
T
T
delete
→ I
H M H
M
M
replace
→
CaseA
→
by pred
T
delete
T
H
T
M delete
H
→
M
M
delete
unblacken
M
root
→
H
10
→
→ T
CaseB
H
→
Augmenting Red-Black Trees Can often improve running time of additional operations on a data structure by caching extra information in the header node or data nodes of a data structure. Must insure that this information can be updated efficiently for other operations. Examples: 1. Store in a header node the length of a linked or doubly-linked list. 2. Store in a header node the maximum value of a linked list representation of a bag. Can also store a pointer to the list node containing the maximum value. Does it matter whether the list is sorted vs. unsorted? Linked vs. doubly-linked? 3. Store the size of every red-black subtree in the root of that subtree. size[leaf] = 0 size[node] = 1 + size[left[node]] + size[right[node]] Can use size field to: • Determine size of tree in Θ(1) worst-case time. • Perform Select(T, k) (i.e. find the kth order statistic) in Θ(lg(n)) worst-case time. • Determine the rank of a given key x in Θ(lg(n)) worst-case time. Insert and Delete can update the size field efficiently (i.e., without changing the asymptotic running time of Insert and Delete): Insert: In downward phase searching for insertion point, increment sizes by one. In upward ”fix-up” phase, update sizes at each rotation. Delete: After deleting node y, decrement sizes on path to root[T] by one. In upward ”fix-up” phase, update sizes at each rotation. In general, can efficienly augment every node x of a red-black tree with the a field that stores the result of any function f that depends only on key[x], f(left[x]), and f(right[x]). • for Insert, field only needs to be updated on path from insertion point to root. • for Delete, only needs to be updated on path from deletion point to root. • easy to update at every rotation.
11