-------------------------------------------------------------------------
Notes on Union Find Ackermann Analysis 
 based on a lecture by Danny Sleator, 
  based on a talk by Raimund Seidel, and analysis by Seidel and Sharir

scribe: Anupam Gupta
-------------------------------------------------------------------------

First, think of the basic operations as being 
1. union(u,v): unions the trees rooted at u and v together.
2. rootof(x): returns y, the root of x's tree at that time
3. path-compress(x,y): compresses the path between x and some ancestor y.

The usual way we implement find(x) is
 sub find(x) {
   y <- rootof(x);
   path-compress(x,y);
 }


All we need to do is analyse the work done. So we ignore the rootof()
operations done during the finds for the analysis' sake. Since the cost
of the rootof() is the same as the cost of the following path-compress
(PC) operation, we assume we want to bound the cost of the unions and PC
operations alone. The actual cost is higher only by a factor of 2.

Note that the cost of a union is constant.

Moreover, we define cost(PC(x,y)) as the number of nodes whose parents
were reassigned during this PC operation. Note that if the PC operation
was on a single edge, we would count 0 cost, whereas there might be unit
cost.  But the loss in this is only O(# of PC operations), and we can
add back this term at the end, increasing the cost by at most O(m).

--------------------------------------------------------------

So really, how do we bound the cost of these PC ops????

--------------------------------------------------------------
Step I
======

Since these unions make things complicated, let us go from having 
  World A: m unions and PC's arbitrarily interspersed
to
  World B: all the unions first and then the PC's follow.

For this reordering to make sense, we should make sure that

1. for any PC(x,y) operation that made sense in world A, it also
  makes sense in world B. I.e., y is still x's ancestor. 
2. the cost of each PC(x,y) operation in world B has the same cost 
  as the cost in world A. 

 Proofs: deferred.

 QED.

--------------------------------------------------------------
Step II: 
=======

Now that the unions all come first, let us fix the forest F we get after
all the unions are done. (The work done until now for the unions is O(1)
per operation.) Now we have to analyze the cost incurred for the
sequence $C$ of $m$ PC operations to come.

Since the cost of the PC operations on the forest is the sum of the PC
operations on each tree, let us focus on a single tree T, with nodes X.

Mark some of X as the top nodes Xt and the rest as the bottom nodes
Xb. This should be done in such a way that the parent of a top node is
another top node. ("Xt is upwards-closed" for all us math-terminology
fans.) Either of these sets may be empty.

Now this breaks the tree into another forest: the nodes in Xt induce a
tree Tt, and the nodes in Xb induce a forest Fb. We create a new node
N, and attach the roots of all the trees in the forest Fb to N to get
the tree Tb.

Example here.

--------------------------------------------------------------
Why do we do this?

 So that we can now write a divide-and-conquer recurrence that will,
 at the end, give us the Ackermann-style bound we're looking for.

--------------------------------------------------------------
Some proofs now. First make sure that the properties of Xt and Xb do
not change as we apply the PC operations to T.

Lemma: If Xt is upwards-closed in T, and if T' is the tree obtained by
performing a PC operation on T, then Xt is upwards closed in T'.

 Proof. deferred

 QED.

Now consider each of the PC operations on T and see how T evolves as we
perform these operations. Just before we apply PC(x,y) on the current
version of tree T we write down (at most two) operations that work
on the trees Tt and Tb. 

a) if x,y in Xt, then generate PC(x,y), a valid op for Tt.
b) if x,y in Xb, then generate PC(x,y), a valid op for Tb.
c) if x in Xb, y in Xt, then consider the xy path in the current tree T.
    let y' be the first node in Xt on the path (maybe the same as y).
    Generate PC(y', y) on Tt.
    Generate PC*(x, N) on Tb.
    Charge one token.

Nb. PC* works the same as PC, but we consider it separately it for 
charging purposes. These were the "root finds" that we were doing in
lecture, but we've stated it this way in the notes to make things 
explicit.

------------------------------------------------------------
Why are we doing this?

 We want to claim that the work done by the new ops on Tt and Tb
 is at least the work done by the old ops on T. So if we could bound
 the latter we'd be done.

 This will be almost true, with a couple of fudge-factors.

------------------------------------------------------------

Let C be the set of PC ops performed on C
    Ct be the PC ops generated for Tt, Cb be PC ops generated for Tb
    C* be the PC* ops genearated (all for Tb).

Lemma: the number of PC operations performed on T is the same as the 
 number of PC operations performed on Tt and Tb together. I.e.,
   |C| <= |Ct| + |Cb|

 Proof: immediate. 
   QED.

Lemma: the number of tokens used is at most |Ct|

 Proof: every time we generate a token, we generate an op in Ct.
   QED.

Lemma: the number of edges on paths given to the PC* ops over the 
  run of the algorithm is at most |Xb| + number of tokens <= |Xb| + |Ct|

 Proof: Note that whenever we do PC*(x,.), the second argument is N, 
   and this makes a bunch of nodes in Xb now point to N.  In T, all 
   these nodes now have a parent in Xt. For each node in Xb that is
   made to point to N for the first time, charge this operation to 
   the node itself. However, there can be (at most) one node on this 
   path in Tb that was already pointing to N, this must be the node 
   that (in T) had a parent in Xt. This one edge is charged to the 
   token.
  QED.

Lemma: cost_T(C) <= cost_Tt(Ct) + cost_Tb(Cb) + |Ct| + |Xb|

 Proof: the number of parent reassignments on T with a PC op is
   the number of edges on the path minus 1. If the PC op corresponds
   to Cases (a) or (b) above, the cost incurred is the same on both
   sides. The interesting case is (c): here the two paths in Tt and Tb
   have total length the same as the the original path in T. The PC op
   in Tt charges the length of the Tt-path minus 1. Hence, if the right
   side increases by at least the length of the Tb path, we'd be done.
   But by the lemma above, the total lengths of the Tb paths given to
   PC* ops is at most |Xb| + |Ct|, which completes the claim.
  QED.

--------------------------------------------------------------

So we've got what we wanted: the cost of a sequence of PC ops on T can
be bounded by the PC ops on Tt and Tb, plus the fudge factors of |Xb|
and |Ct|.

What good will this do us?

 A lot! We will choose these Xt and Xb sets carefully to get the 
 result we want.

--------------------------------------------------------------
But First, An Easy Result

Let's show that even if we do not do union-by-rank, path compression 
gives us amortized O(log n) performance.

Claim: For any sequence C of m PC operations on n = |X| items.
   cost(C) <= (m + n/2) log_2 n.

 Proof: By induction on n. For n = 1, cost(C) = 0.

   Let Xt and Xb be any valid partition of X such that Xt is
   upwards closed, and |Xt| = |Xb| = n/2. Now recall that
    cost(C) <= cost(Ct) + cost(Cb) + |Xb| + |Ct|  (*)

   by IH: cost(Ct) <= (|Ct| + |Xt|/2) log |Xt|
          cost(Cb) <= (|Cb| + |Xb|/2) log |Xb|
          |Xb|      = n/2
          |Ct|     <= |C| = m

   Hence the RHS of (*) is at most
      (|Ct| + |Cb| + |Xt|/2 + |Xb|/2) (log |X| - 1) + n/2 + m
   <= (m + n/2) log |X|

  QED.

--------------------------------------------------------------