15-451 Algorithms
Fall 2012
D. Sleator
Augmenting Binary Trees September 27, 2012
-- insertion/deletion/split/join in splay trees
-- augmenting trees
* rank/select
* segment trees
* reversal and simulating pancake sorting
see http://www.codeforces.com/contest/117/submission/860934
for sample splay tree code
Also at the end of this lecture.
----------------------------------------------------------------------------
How to use splaying to do other important operations on trees
-------------------------------------------------------------
Insertion (see last lecture notes)
Deletion (see last lecture notes)
split(S, k) S is a set reprecented by a splay tree. k is a key.
This returns two sets S1 of keys <= k and S2 of
keys > k.
This is implemented as follows. First search for key k
in the tree. If it is there, we splay it. Now detach the
left subtree of the root and that is S1. And the rest is S2.
If k is not in the tree, then let m be the last key found
before we went off the bottom of the tree. We know that
m is either the smallest key > k in the set or the largest
key < k in the set. So we splay m, and separate the tree
by removing m's left or right subtree, which ever is
appropriate.
This is amortized O(log n). The search for k is paid
for by the splay. The subsequent cutting of links only
reduces the potential.
Join(S,T) Here S and T are trees representing sets.
All the keys in S are less than all the
keys in T. This forms the union of these
two sets and returns it.
It works as follows. Let k be the largest key in S.
(Find it by walking down right children until there
is none.) Now do Splay(S,k). Now simply make T
the right child of the root of the new S.
This is amortized O(log n). The work of finding
k is paid for by the splay. The linking of T to
S can cause the potential of the root of S to increase
by at most log(n).
Augmenting Binary Search Trees
==============================
In this lecture we will discuss how to add extra fields
to binary search trees to give them additional functionality.
We'll work in the context of splay trees, although most of
the ideas carry over to other types of search trees.
Rank/Select
-----------
Suppose we want to represent a set of keys and support
the following operations:
rank(S,k): return the rank of key k. The smallest
element in the set has rank 0, the largest
has rank n-1 (where n is the size of the set).
select(S,i): return the ith key in the set.
insert(S,k): if k is not already in the set, then
insert k into the set.
delete(S,k): if k is in the set, delete it.
The simplest way to do this is to add an extra field to
the nodes. So in addition to left and right, we'll also
store the *size* of the node. If n is a node, then n.size
is the number of nodes in the subtree rooted there.
We need to show how to implement all of the operations and
to correctly and efficiently maintain the size fields of
all the nodes.
Let's denote by n.l and n.r the left and right children
of a node n. We can use the following fact to help us
keep the sizes:
n.size = n.l.size + n.r.size + 1
(The null nodes have size 0.)
When we do a rotation, the only nodes whose sizes change
are the two involved in the rotation.
y x
right rotation: / ====> \
x y
To update the sizes after the rotation, we first fix y
(by applying the above formula) and then we fix x.
This allows us to splay while maintaining the size fields.
It remains to show how to implement rank and select.
Here's how select(S,i) works. In general we're at some
node and we're looking for the ith smallest key in the
subtree rooted there. We look at the size of the left
subtree. If it's i-1 then the current node is the one we
want. If it's less than this then we need to go down to
the right. If it's greater than this, we need to go down
to the left. When we go to the right we must adjust the
index we're looking for because of the stuff we just
passed (the current node and the left subtree).
Note that after finding the appropriate node, we must
now splay this node. It's required to do this to pay
for the work we just did to find it.
(Do example)
For rank(s,k), we first find key k in the tree. We do
this by a normal search in the tree. After we find it,
we splay it to the root. It's rank is the size field
of the left child of the root. (Again, splaying is
necessary to pay for the work. It also makes it simpler
to compute the rank.)
Segment Trees
-------------
The problem here is to maintain a dynamically changing collection of
(possibly overlapping) segments on a line. It can be viewed as a
1-D geometry problem.
insert(S,lo,hi) Insert a new segment [lo,hi] into the set
delete(S,lo,hi) Delete the given segment from the set
find(S,x) Determine if x is in one of the segments
of S. If it is, return a segment containing x.
We'll solve this problem using a splay tree where there's one
node corresponding to each segment in the set. The tree is ordered
by the lo values of the segments.
To implement find we're going to use the split operation of splay
trees. This will let us split the segments into two sets. The
set with left endpoints which are <= x and the ones with left
endpoints > x.
Let L be the set with left endpoints <= x. The point x will be in one
of the segments if and only if some element of L has a right endpoint
that is >= x.
It will be possible to do this if we were able to keep the maximum
of all the right endpoints in a subtree rooted at a node. So this is
the auxiliary information that we will keep. So in addition to
keeping [lo,hi] in each node, we add a new field that is the maximum
value of hi in any node in the subtree rooted here. (See example
on page 313 of CLRS.)
So in a node n, let n.m be the maximum value of the hi fields in
in all the nodes in the subtree rooted here. Given that the left
and right children are set correctely, here's how we can set n.m:
n.m = max (n.l.m, n.r.m, n.hi)
(Where the m value of a NULL is -infinity.)
So here's how find(S,x) works. Split the tree using key x.
Let L be the resulting left set with root r. If r.m < x
then we know there is no interval containing x. If r.m >= x
then see which of the left subtree of r, the node r, or the
right subtree of r contains an interval containing x. Continue
traversing down the tree in this fashion until we come to a
node with an interval containing x.
When we find this node, we have to splay it, of course, and
then glue the tree back together to preserve the interval
set.
It's easy to use the formula for n.m above to keep the m fields
correct whenever we do a rotation. (Just like the size fields.)
Insert and Delete are now standard.
Pancake sorting
---------------
You probably recall from 15-251 how to sort by prefix reversal
(also known as pancake flipping).
Here's a problem we want to solve. Given an arbitrary permutation
of numbers 1...n (a list of numbers) we want to sort them by reversing
prefixes. For example, say we start with:
5 1 4 2 3
Reverse the whole thing and get:
3 2 4 1 5
Now put 4 at the front:
4 2 3 1 5
Now put 4 in place:
1 3 2 4 5
Now put 3 in front:
3 1 2 4 5
Not put 3 in place:
2 1 3 4 5
finish up:
1 2 3 4 5
Clearly this algorithm takes at most 2(n-1) flips. But suppose
you wanted to print out the set of prefix lengths that are flipped
in order to sort the permutation. So in this case we would output
5 3 4 2 3 2
The naive algorithm to compute this will take time O(n^2), because it
has to maintain, say, an array of all the elements, and each prefix
reversal takes O(n) time and there are O(n) of them. Not to mention
searching for the next thing that needs to be put in place.
We'll show how to use augmented splay trees to do this in O(n log n)
time.
Since the tree is not going to keep the keys in symmetric order (left
to right) we need a way to find a given key in the tree. To
facilitate this we create an array A where A[i] is a pointer to
the node in the tree that stores i. This array does not change over
time.
To allow for prefix reversal, we will keep what I call a reverse
bit r in each node. The meaning of this bit is the following.
If n.r is true, then the meaning of left and right in the subtree
rooted at n are reversed. And this applies recursively to all the
reverse bits in the tree.
(Example)
So, if i have a tree that represents the sequence 3 2 1 4, and
I toggle the reverse bit of the root of the tree, then I am
representing 4 1 2 3.
If I have a node n where n.r=true, and can "discharge" this
reversal by swapping the left and right children of n and
then toggling the reverse bits in the left and right children.
This follow from the following identity for strings. If A and
B are strings and AB is the concatenation of A and B we have:
reverse(AB) = reverse(B) reverse(A)
How do we implement splaying in a tree with reverse bits?
The shape of paths is of course important to the splaying
algorithm. So what we can do is before we splay along
a path in the tree, we first discharge all the reverse bits
along that path. Then we can use the standard splaying
algorithm. Every node to which a rotation is applied has
its reverse bit = false.
We're also going to need to maintain size fields as we did for
the rank/select problem.
Here's how we can compute the prefix reversal sequence in
O(n log n) time.
First we construct an initial tree to represent the initial
sequence. We also construct the array A[] described above.
Now we find key n in the tree by looking up A[n]. This points to a
node. We splay this node. Now n is at the root of the tree. The
size field of the left subtree of the root tells us the index of n.
We split off n and its left subtree from the rest. Now we reverse
that part (by toggling its reverse bit), and join them together again.
Now n is at the left. And we now toggle the reverse bit of the whole
thing (and output the size of the tree). n is now at the right end.
So we delete n from the tree, and repeat the whole process for n-1.
The process of handling n is a split and a join, which is O(log n).
Thus the whole algorithm is O(n log n).
-------------------Java implementation of this-----------------
import java.io.*;
import java.util.*;
/*
This program takes an array which contains a permutation of
0...n-1, and computes a sequence of prefix reversals that sorts
the permutation.
It's set up to take a permutation on the command line, like this:
$ java Pancake 5 4 1 2 0 3
6
4
2
3
The output is a sequence of prefixes to reverse in order to sort
the permutation.
D. Sleator, September 2012
*/
public class Pancake {
public static void main(String[] args) {
int n = args.length;
int[] A = new int[n];
for (int i=0; i=0; k--) {
Node x = ptr[k];
Splay.splay(x);
int j = (x.l==null)?0:x.l.size;
if (j!=k) { // if j=k then k is already in place
if (j>0) { // move k to the front
moves[m++] = j+1;
Node b = x.r;
b.p = null;
x.r = null;
x.update();
x.flip = !x.flip;
Splay.join(x, b);
Splay.splay(x);
}
moves[m++] = k+1;
x.flip = !x.flip; // move k to the back
}
Splay.delete(x);
}
int[] ret = new int[m];
for (int i=0; i