15-451/651 Algorithms                                        9/04/13

RECITATION NOTES

- go over the mini if there were any common problems
- Finish Appendix A on recurrences (whatever we didn't get to last time)
- discuss one or more of the problems below
==============================================================
- A problem related to Thurs lecture:

  We saw in lecture by a "stack of bricks" argument that a recurrence
  of the form:

     T(n) = n + T(a1 * n) + T(a2 * n) + ... + T(ak * n)

  (for constants a1,...,ak) solves to O(n) so long as
  a1 + ... + ak < 1.  What about a recurrence like:

     T(n) = n^2 + T(a1 * n) + T(a2 * n) + ... + T(ak * n)

  or

     T(n) = n^b + T(a1 * n) + T(a2 * n) + ... + T(ak * n)

  What conditions on the ai do you need so that this gives O(n^b)?

  [Answer: we just need a1^b + ... + ak^b < 1.  Let's define "r" to be
  this quantity.  Looking at the "stack of bricks" argument, each time
  you move down one level, each brick gets replaced by k bricks whose
  total size is r times its own size.  So, the sum of the sizes of the
  bricks at each level is r times the sum at the previous level.  This
  means the total work done is a decreasing geometric series.]

- Here are a few possible problems to do related to Tues lecture

Problem 1: Sorting by swaps.

Imagine a sorting algorithm that somehow picks two elements that are
out of order with respect to each other (not necessarily adjacent, as
in insertion-sort) and swaps them.  Question: can we argue that such a
procedure (no matter how stupidly it picks the elements to swap, so
long as they were out of order) has to terminate (i.e., the number of
swaps it will perform is bounded)?

To do this, one good way is to find some finite, non-negative quantity
that is reduced with each swap.  Such a thing is often called a
"potential function" and we will discuss more about these later.  Any
ideas?

One quantity that works is the total number of pairs of elements that
are out of order wrt each other (this is called the number of
*inversions*).
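As a quick sanity check, the number of inversions can be counted directly. A brute-force O(n^2) sketch (an O(n log n) merge-sort-based count also exists, but this is enough for experimenting with the potential function):

```python
def count_inversions(a):
    """Count pairs (i, j) with i < j but a[i] > a[j] -- the inversions."""
    n = len(a)
    return sum(1 for i in range(n)
                 for j in range(i + 1, n)
                 if a[i] > a[j])

# [5, 1, 2] has two inversions: (5,1) and (5,2).
```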
When a swap is performed, clearly the inversion between those two
elements is removed, but notice that new inversions may be created
(e.g., in [5, 1, 2], swapping 5 and 2 creates a new inversion between
1 and 2).  Can you argue that the total number has to go down with
each swap?

Problem 2: Here is a variation on the 20-questions game.

Say I choose a number between 1 and N and you want to guess it in as
few questions as possible (each time you make an incorrect guess, I'll
tell you if it is too high or too low).  As we all know, the strategy
for this problem that minimizes the worst-case number of guesses is to
do binary search.  But what if you are only allowed ONE guess that is
too high?  Can you still solve the problem in o(N) guesses?  [If you
were not allowed *ANY* guesses that are too high, the only option you
would have would be to guess 1,2,3,... in order.]  Any ideas?

[Another way to state the problem: you want to figure out how many
centimeters high you can drop an egg without it breaking, and you only
have two eggs...]

Here's a strategy: guess 1, sqrt(N), 2*sqrt(N), ..., until you get to
some (i+1)*sqrt(N) that's too high.  Now we know the number is in the
interval [i*sqrt(N), (i+1)*sqrt(N)], so we can finish it off with at
most sqrt(N) more guesses, walking up one at a time.  The total is at
most 2*sqrt(N).

Can you show an Omega(sqrt(N)) lower bound for any deterministic
algorithm?  Hint: What if the algorithm makes a guess g_i that is at
least sqrt(N) larger than any previous guess?  What if the algorithm
*never* makes such a guess?

Want to try an upper/lower bound if you're allowed *two* guesses that
are too high?  [Ans: Theta(N^{1/3})]

Problem 3: AVL trees

There's a balanced search tree data structure called "AVL trees" where
at every node in the tree, the height of the left subtree h_L and the
height of the right subtree h_R differ by at most 1 (in other words,
|h_L - h_R| <= 1).
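The balance condition can be stated as a recursive check. A minimal sketch (the Node class and helper names here are hypothetical, not from any particular AVL implementation; recomputing heights makes this quadratic, which is fine for checking but not for real tree maintenance):

```python
class Node:
    # Hypothetical minimal tree node, just for checking the invariant.
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def height(t):
    """Height of a single node with no children is 0; an empty tree is -1."""
    if t is None:
        return -1
    return 1 + max(height(t.left), height(t.right))

def is_avl(t):
    """Check |h_L - h_R| <= 1 at every node of the tree."""
    if t is None:
        return True
    return (abs(height(t.left) - height(t.right)) <= 1
            and is_avl(t.left) and is_avl(t.right))
```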
We're not going to talk about this particular method in class, but the
high-level idea is that if you required the tree to be *perfectly*
balanced, then each insert might force you to make all sorts of
changes in the tree to maintain your invariant; but by allowing this
slight amount of slop, you can do the updates with only a constant
factor of extra work.  It's a little bit like having a pivot that is
"roughly" in the middle, if you think of a node as being the pivot
value for its subtree.

The problem is: show that this guarantees an overall height
h(n) = O(log n) for an n-node tree.

To analyze this, instead of looking at h(n), let's define n(h) to be
the *minimum* number of nodes in a tree of height h.  E.g., if we can
show that n(h) >= 2^{h/2}, then that means a tree of n nodes can have
height at most 2*lg(n).  Ideas?

If the tree has height h, then at least one of the subtrees of the
root has height h-1 (by definition of height) and the other has height
at least h-2 (by the balance property).  Counting the root itself,
n(h) >= 1 + n(h-1) + n(h-2) > n(h-1) + n(h-2).  This is Fibonacci...,
but as a crude bound, we have n(h) > 2*n(h-2).  Every time h goes down
by 2, we multiply by at least 2.  So this gives us n(h) >= 2^{h/2}, if
we define a single root with no children as having height 0 (to get
the base case).
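The recurrence and the 2^{h/2} bound are easy to verify numerically. A sketch (the function name is mine; n(1) = 2 since the smallest height-1 tree is a root plus one child):

```python
def n_min(h):
    """Minimum number of nodes in an AVL tree of height h.
    A single node has height 0, so n(0) = 1 and n(1) = 2; in general
    the sparsest tree is a root plus minimal subtrees of heights
    h-1 and h-2, giving n(h) = 1 + n(h-1) + n(h-2)."""
    if h == 0:
        return 1
    if h == 1:
        return 2
    return 1 + n_min(h - 1) + n_min(h - 2)

# Check the crude bound n(h) >= 2^{h/2} for small heights.
for h in range(20):
    assert n_min(h) >= 2 ** (h / 2)
```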