RECITATION NOTES 15-451 Algorithms 09/09/09 Some problems for recitation * problem related to Thurs lecture last week. * getting tight upper/lower bounds for some problem * more practice with lower bounds. ======================================================================= Problem related to Thurs lecture: We saw in lecture by a "stack of bricks" argument that a recurrence of the form: T(n) = n + T(a1 * n) + T(a2 * n) + ... + T(ak * n) (for constants a1,...,ak) solves to O(n) so long as a1 + ... + ak < 1. What about a recurrence like: T(n) = n^2 + T(a1 * n) + T(a2 * n) + ... + T(ak * n) or T(n) = n^b + T(a1 * n) + T(a2 * n) + ... + T(ak * n) What conditions on the ai do you need so that this gives O(n^b)? [Answer we just need a1^b + ... + ak^b < 1. Let's define "r" to be the quantity on the LHS. In this case, each time you move down one level of bricks, each brick gets replaced by something r times its size. So, the sum of sizes of the bricks at level j is r times the sum at level j-1. This means the total work down is a decreasing geometric series.] ============================================================================ Here is a variation on the 20-questions problem. Say I choose a number between 1 and N and you want to guess it in as few questions as possible (each time you make an incorrect guess, I'll tell you if it is too high or too low). As we all know, the strategy for this problem that minimizes the worst-case number of guesses is to do binary search. But, what if you are only allowed ONE guess that is too high? Can you still solve the problem in o(N) guesses? [If you were not allowed *ANY* guesses that are too high, the only option you would have would be to guess 1,2,3,... in order]. Any ideas? [Another way to state the problem: you want to figure out how many centimeters high you can drop an egg without it breaking, and you only have two eggs...] Here's a strategy: guess 1, sqrt(N), 2*sqrt(N),..., until you get to some (i+1)*sqrt(N) that's too high. Now we know the number is between [i*sqrt(N) and (i+1)*sqrt(N)] so we can finish it off in sqrt(N) guesses. Total is at most 2*sqrt(N). Can you show an Omega(sqrt(N)) lower bound for any deterministic algorithm? What if the algorithm makes a guess g_i that is at least sqrt(N) larger than any previous guess? What if the algorithm *never* makes such a guess? Want to try an upper/lower bound if you're allowed *two* guesses that are too high? [Ans: Theta(n^{1/3})] ======================================================================= More practice with lower bounds: lower bounds for quicksort variants Last week we looked at two versions of quicksort: (1) pick leftmost element as pivot, and (2) pick *random* element as pivot. We saw there were inputs that could cause the 1st method to make Omega(n^2) comparisons, but proved that for *any* input, expected time of 2nd method is only O(n log n). But might ask: what about a more sophisticated method for picking a pivot? E.g., unix qsort uses "median of 3" rule: look at leftmost, rightmost, and middle element and take their median, then use that as the pivot. Does that get rid of the Omega(n^2) worst case? Claim: Any deterministic algorithm that selects pivots by examining a constant c number of elements and then choosing one of them as pivot will still result in quicksort taking Omega(n^2) comparisons in the worst-case. [So, this is a lower bound for a restricted class of algorithms] Proof: Given some algorithm ALG that chooses pivots by examining <= c elements, we will show that for any n, we (the adversary) can construct an ordering of {1,...,n} that causes ALG to make n^2/c' comparisons, for some constant c'. Here is how we do it. Think of the array A as n boxes. We will now simulate ALG on A, filling in the boxes as we go. Set counter i=1. When ALG examines a box, put i in there, and then increment i. Eventually, after examining <= c elements, ALG picks one as a pivot and splits. Key invariant: all empty boxes will have elements >= the current value of i. So, all empty boxes go into the GREATER pile. Notice that since our procedure only increments i, this invariant is actually correct, and this holds true at every partition step. Since there are at least n-kc empty boxes after kth partition step, the total number of comparisons is at least: (n-c) + (n-2c) + (n-3c) + ... which is > n^2/c' for some constant c'. QED One note: where did we use the fact that the algorithm was deterministic? Ans: we came up with elements as we went along, which for a randomized algorithm would be cheating (the input is supposed to be fixed *first* before the algorithm starts making choices). But the point is that since the alg is deterministic, it will behave exactly the same way if you run it again on whatever inputs we ended up putting in.