15-451/651 Algorithms                                        9/04/13

RECITATION NOTES

- go over the mini if there were any common problems
- Finish Appendix A on recurrences (whatever we didn't get to last time)
- discuss one or more of the problems below
==============================================================
- A problem related to Thurs lecture:

  We saw in lecture by a "stack of bricks" argument that a recurrence
  of the form:

     T(n) = n + T(a1 * n) + T(a2 * n) + ... + T(ak * n)

  (for constants a1,...,ak) solves to O(n) so long as
  a1 + ... + ak < 1.  What about a recurrence like:

     T(n) = n^2 + T(a1 * n) + T(a2 * n) + ... + T(ak * n)

  or

     T(n) = n^b + T(a1 * n) + T(a2 * n) + ... + T(ak * n)

  What conditions on the ai do you need so that this gives O(n^b)?

  [Answer: we just need a1^b + ... + ak^b < 1.  Let's define "r" to be
  this quantity.  Looking at the "stack of bricks" argument, each time
  you move down one level, each brick gets replaced by k bricks whose
  total size is r times its own size.  So, the sum of the sizes of the
  bricks at each level is r times the sum at the previous level.  This
  means the total work done is a decreasing geometric series.]

- Here are a few possible problems to do related to Tues lecture

Problem 1: Sorting by swaps.

Imagine a sorting algorithm that somehow picks two elements that are
out of order with respect to each other (not necessarily adjacent, as
in insertion-sort) and swaps them.  Question: can we argue that such a
procedure (no matter how stupidly it picks the elements to swap, so
long as they were out of order) has to terminate (i.e., the number of
swaps it will perform is bounded)?

To do this, one good way is to find some finite, non-negative quantity
that is reduced with each swap.  Such a thing is often called a
"potential function" and we will discuss more about these later.  Any
ideas?

One quantity that works is the total number of pairs of elements that
are out of order wrt each other (this is called the number of
*inversions*).
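As a quick sanity check, the number of inversions can be counted directly. A brute-force O(n^2) sketch (an O(n log n) merge-sort-based count also exists, but this is enough for experimenting with the potential function):

```python
def count_inversions(a):
    """Count pairs (i, j) with i < j but a[i] > a[j] -- the inversions."""
    n = len(a)
    return sum(1 for i in range(n)
                 for j in range(i + 1, n)
                 if a[i] > a[j])

# [5, 1, 2] has two inversions: (5,1) and (5,2).
```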
When a swap is performed, clearly the inversion between those two
elements is removed, but notice that new inversions may be created
(e.g., in [5, 1, 2], swapping 5 and 2 creates a new inversion between
1 and 2).  Can you argue that the total number has to go down with
each swap?

Problem 2: Here is a variation on the 20-questions game.

Say I choose a number between 1 and N and you want to guess it in as
few questions as possible (each time you make an incorrect guess, I'll
tell you if it is too high or too low).  As we all know, the strategy
for this problem that minimizes the worst-case number of guesses is to
do binary search.  But what if you are only allowed ONE guess that is
too high?  Can you still solve the problem in o(N) guesses?  [If you
were not allowed *ANY* guesses that are too high, the only option you
would have would be to guess 1,2,3,... in order.]  Any ideas?

[Another way to state the problem: you want to figure out how many
centimeters high you can drop an egg without it breaking, and you only
have two eggs...]

Here's a strategy: guess 1, sqrt(N), 2*sqrt(N), ..., until you get to
some (i+1)*sqrt(N) that's too high.  Now we know the number is in the
interval [i*sqrt(N), (i+1)*sqrt(N)], so we can finish it off with at
most sqrt(N) more guesses, walking up one at a time.  The total is at
most 2*sqrt(N).

Can you show an Omega(sqrt(N)) lower bound for any deterministic
algorithm?  Hint: What if the algorithm makes a guess g_i that is at
least sqrt(N) larger than any previous guess?  What if the algorithm
*never* makes such a guess?

Want to try an upper/lower bound if you're allowed *two* guesses that
are too high?  [Ans: Theta(N^{1/3})]

Problem 3: AVL trees

There's a balanced search tree data structure called "AVL trees" where
at every node in the tree, the height of the left subtree h_L and the
height of the right subtree h_R differ by at most 1 (in other words,
|h_L - h_R| <= 1).
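The balance condition can be stated as a recursive check. A minimal sketch (the Node class and helper names here are hypothetical, not from any particular AVL implementation; recomputing heights makes this quadratic, which is fine for checking but not for real tree maintenance):

```python
class Node:
    # Hypothetical minimal tree node, just for checking the invariant.
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def height(t):
    """Height of a single node with no children is 0; an empty tree is -1."""
    if t is None:
        return -1
    return 1 + max(height(t.left), height(t.right))

def is_avl(t):
    """Check |h_L - h_R| <= 1 at every node of the tree."""
    if t is None:
        return True
    return (abs(height(t.left) - height(t.right)) <= 1
            and is_avl(t.left) and is_avl(t.right))
```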
We're not going to talk about this particular method in class, but the
high-level idea is that if you required the tree to be *perfectly*
balanced, then each insert might force you to make all sorts of
changes in the tree to maintain your invariant; but by allowing this
slight amount of slop, you can do the updates with only a constant
factor of extra work.  It's a little bit like having a pivot that is
"roughly" in the middle, if you think of a node as being the pivot
value for its subtree.

The problem is: show that this guarantees an overall height
h(n) = O(log n) for an n-node tree.

To analyze this, instead of looking at h(n), let's define n(h) to be
the *minimum* number of nodes in a tree of height h.  E.g., if we can
show that n(h) >= 2^{h/2}, then that means a tree of n nodes can have
height at most 2*lg(n).  Ideas?

If the tree has height h, then at least one of the subtrees of the
root has height h-1 (by definition of height) and the other has height
at least h-2 (by the balance property).  Counting the root itself,
n(h) >= 1 + n(h-1) + n(h-2) > n(h-1) + n(h-2).  This is Fibonacci...,
but as a crude bound, we have n(h) > 2*n(h-2).  Every time h goes down
by 2, we multiply by at least 2.  So this gives us n(h) >= 2^{h/2}, if
we define a single root with no children as having height 0 (to get
the base case).
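The recurrence and the 2^{h/2} bound are easy to verify numerically. A sketch (the function name is mine; n(1) = 2 since the smallest height-1 tree is a root plus one child):

```python
def n_min(h):
    """Minimum number of nodes in an AVL tree of height h.
    A single node has height 0, so n(0) = 1 and n(1) = 2; in general
    the sparsest tree is a root plus minimal subtrees of heights
    h-1 and h-2, giving n(h) = 1 + n(h-1) + n(h-2)."""
    if h == 0:
        return 1
    if h == 1:
        return 2
    return 1 + n_min(h - 1) + n_min(h - 2)

# Check the crude bound n(h) >= 2^{h/2} for small heights.
for h in range(20):
    assert n_min(h) >= 2 ** (h / 2)
```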