15-451 Algorithms: Lecture 2/15/00 Topics: I. Red-black trees (cont) II. Skip-Lists III. Tries IV. Splay trees ********** I. Red-black trees Recap: Binary search tree with the following properties: 1. Every node is either red or black. 2. The root and every external node is black. 3. If a node is red, both its children are black. 4. Every path from a node to a descendant external node contains the same number of black nodes. Theorem: A red-black tree with n key-nodes has height h <= 2 lg(n+1) Tree rotation: x y / \ left-rotate(x) ---> / \ T1 y <--- right-rotate(y) x T3 / \ / \ T2 T3 T1 T2 Insert(node x): Idea: - Insert x into tree, color R (Only property (3) might be violated (if x's parent is R)) - Move violation up tree -- while maintaining property 4 -- until find a place it can be fixed by rotation. - Time will be O(lg n). 1. Insert x into tree, color R 2. while x's parent p[x] is R: 3. if p[x] is a left child 4. let y be sibling of p[x] 5. if y is R // Case I: 6. color p[x] black, y black, p[p[x]] red 7. x <- p[p[x]] 8. else 9. if x is a right child // Case II: 10. x <- p[x], left-rotate(x) // Case III: 11. color p[x] black, p[p[x]] red, right-rotate(p[p[x]]) 12. else (same as 4-11, with right and left exchanged) endwhile 13. color root black ********** Case I: Cb After: Cr=x / \ Ar Dr=y Ab, Db / \ / \ T1 Br=x T4 T5 / \ T2 T3 T1-T5 are equal-black-height subtrees with black roots Push C's black down to its children Property 4 still holds: Just moved the black down one level Case II: Cb Cb / \ / \ Ar T4 root=y Br T4 root=y / \ / \ T1 Br=x Ar=x T3 / \ / \ T2 T3 T1 T2 T1-T4 are equal-black-height subtrees with black roots If x is a right child, switch to a left child. Proceed to case III Case III: tree on the right above becomes Bb / \ Ar=x Cr / \ / \ T1 T2 T3 T4 Rotate to improve tree's balance. Preserve property (4) by retaining a single black on paths through this portion of the tree. Because root is black, we are done ********** Analysis: - Go up tree performing case 1, which only changes colors. - When case 2 occurs perform 2 rotations and stop - When case 3 occurs perform 1 rotation and stop Thus O(lg n) time with O(1) rotations. [Note: AVL trees may need lg n rotations for insertion.] ********** II. Skip Lists [Not in book] Put randomization to work for search trees. Assume all keys are distinct. Search tree for a dynamic set S. Would like to do: - output min, max, sort of S - given key k: search(k), insert(k) - given ptr to a key-node x: output successor(x) or predecessor(x), or delete(x) Start with a doubly linked list of all the keys in S in sorted order. head-> [3]<->[9]<->[12]<->[18]<->[29]<->[35]<->[37] <-tail What are the time bounds for each of the above operations? - search=Theta(n). - insert=O(1)+search=Theta(n) - min,max,succ,pred,delete=Theta(1) ********** How might we improve the search time? - build binary tree on top, e.g., red-black tree - search time is O(depth of tree) = O(lg n) - must tree be binary to obtain O(log n)? no - what if its constant degree d? takes O(d) time to determine which child contains the value. If roughly balanced, then O(log_d n) depth. so total O(d log_d n) = O(log n). - what if its only Expected constant degree? seems ok, will need to analyze - how might we use randomness to construct this tree? ********** Suppose we build a tree level by level. For each key, we flip a coin and include it in another linked list with probability p=1/2: [9]<--------------->[29]<-------->[37] head-> [3]<->[9]<->[12]<->[18]<->[29]<->[35]<->[37] <-tail coins T H T T H T H What have we gained? Can now make jumps while searching. Repeat again: [9]<----------------------------->[37] [9]<--------------->[29]<-------->[37] head-> [3]<->[9]<->[12]<->[18]<->[29]<->[35]<->[37] <-tail coins H T H coins T H T T H T H And repeat again: top-> [9]<----------------------------->[37] [9]<--------------->[29]<-------->[37] head-> [3]<->[9]<->[12]<->[18]<->[29]<->[35]<->[37] <-tail coins T T coins H T H coins T H T T H T H What have we gained? Can do fast searches 1. Find location in top list 2. Recurse on sublist at next level down E.g., search(18), search(35) Expected time for search(k)? - At each level, compare k with the keys in a sublist of expected O(1) size - How many levels? for HW3, will show Theta(lg n) levels with high probability - Total: O(lg n) expected time for search ********** Time for insert(k)? search(k) plus - add to bottom level: O(1) time - flip coin to see if add to next level E.g., insert(14) top-> [9]<------------------------------------>[37] [9]<---------------------->[29]<-------->[37] head-> [3]<->[9]<->[12]<->[14]<->[18]<->[29]<->[35]<->[37] <-tail With probability 1/2, add to next level up: top-> [9]<------------------------------------>[37] [9]<-------->[14]<-------->[29]<-------->[37] head-> [3]<->[9]<->[12]<->[14]<->[18]<->[29]<->[35]<->[37] <-tail And so on... Expected number of levels? 2 Expected time for insert(k)? O(lg n) Time for min, max, sort, successor, predecessor, delete? - Still O(1) worst case time, except for delete - delete is O(1) expected time. Why? delete all copies, expected number of copies is 2. ********** III. Tries Suppose we have a set S of character strings. E.g., S = {ape, application, apple, ant, applet, book, black} We want to do search and insert, but also prefix-search: e.g., return all strings starting with "ap" - |prefix search| = how many strings start with "ap"? What if use binary search tree? Attempt to insert "ap", then get successors until first that no longer starts with "ap" - O(lg n + |prefix-search|) string comparisons How can ideas from radix sorting be used? - consider each character: have bucket for each alphabet letter TRIE: _____*_____ a/ b\ n/\p l/\o t/ e/\p a/ \o $/ $/ |l c| |k |e\i k| |$ $/|t \c $| |$ |a |t |i |o |n |$ Letters on the edges, special end-of-string character "$" What is time for search(s) and insert(s)? O(|s|) How is prefix-search performed? return all strings in "ap" subtree. Wasteful in terms of pointers (separate node for each character) - so collapse all single child nodes into their parent - get substrings whenever there is no branching E.g., _______*________ a/ b\ nt$/\p lack$/\ook$ e$/\pl e|\ication$ $/|t$ ********** IV. Splay Trees Binary search trees we have looked at thus far: - search(k) leaves the tree unchanged - insert(k) may perform rotations to rebalance, but basically key tends to end up at a leaf in the tree. Why might this be bad? No improvements for repeated search of same key. Also faster search for first nodes inserted, not the most-recently inserted. How can this be improved? Rotate searched/inserted key to the top of the tree! Cost? More rotations... But O(lg n) time ("amortized") Splay trees heavily used. Danny Sleator and Bob Tarjan just won the Kannelakis award for splay trees (1983 paper). All operations based on a single "splay" operation. We will describe bottom-up splaying. Every time a node is accessed in a splay tree, it is moved to the root of the tree, using the splay operation. The splay(x) operation is a sequence of rotations, ending when x is the root of the tree. Each rotation is one of the following: zig (terminal case): y x / ==> \ x y zig-zag: z z / / x y ==> x ==> / \ \ / y z x y zig-zig: z x / y \ y ==> / \ ==> y / x z \ x z Which is different from normal rotation of x to the root? zig-zig ********** E.g., 10 / 8 / \ 2 9 / \ 1 6 / \ 4 7 / \ 3 5 splay(3) zig-zig: zig-zag: zig: 10 10 10 10 3 / / / / / \ 8 8 8 3 2 10 / \ / \ / \ / \ / / 2 9 2 9 3 9 2 8 1 8 / \ / \ / \ / / \ / \ 1 4 1 3 2 4 1 4 9 4 9 / \ \ / \ \ \ 3 6 4 1 6 6 6 / \ \ / \ / \ / \ 5 7 6 5 7 5 7 5 7 / \ 5 7 ********** search(k): normal binary tree search followed by splay(k) if found insert(k): search + insert + splay delete(k): delete + splay at parent