15-451/651 Algorithms                                          9/11/13

RECITATION NOTES

- any questions about lectures?
- discuss one or more of the problems below

==============================================================

Question 1:
-----------

Suppose we want to find a min-cut in a weighted graph, where each edge
(x,y) has a non-negative weight w(x,y) >= 0.  The goal is now to find
a cut with minimum total weight of edges, instead of the minimum
cardinality set of edges.

Any suggestions for an algorithm for the weighted minimum cut problem?

Modify the Simple MinCut algorithm as follows: we pick a random edge
from the graph, but now the probability of an edge (x,y) being picked
is proportional to w(x,y).  Contract it, remove self-loops, and
continue.  (If the edge weights are all 1, we get back our original
algorithm.)

The analysis of the unweighted algorithm given in Lecture #4 goes
through unchanged.  (For Fact 1, show that the weight of the min cut
is at most the total weight of the edges incident to any single
vertex, that vertex's "weighted degree".  For Claim 5, show that when
n-i vertices remain, the total weight of the surviving edges must be
at least (n-i)k/2, where k is the weight of the minimum weight cut.)

How fast can this algorithm be implemented?  The O(n^2)-time
implementation -- where each contraction is implemented in O(n) time --
goes through unchanged.  The extension of the method based on
Kruskal's algorithm and finding MSTs requires a little more effort.
(Bonus problem: can you figure out how to extend the Kruskal-based
algorithm to the weighted setting?)


Question 2: (Going from Monte Carlo to Las Vegas)
-------------------------------------------------

Suppose you had a (Monte Carlo) randomized algorithm WeirdSort that
takes an unsorted array I of elements as input, and outputs another
array O.  The algorithm always runs in time T(n), and the output O is
always some permuted version of the input I.  Moreover, we know that

        Pr[ output O is in sorted order ] >= p.

So the algorithm does the right thing with probability at least p.
Remember: this probability is over the randomness of the algorithm,
and it holds for every input I.

Question: Prove that you can convert such an algorithm into a (Las
Vegas) randomized sorting algorithm that has expected running time
O((T(n)+n) * 1/p).

(Recall that the definition of Las Vegas algorithms means we always
have to output a *sorted* array containing the elements of the input
array I, but now the runtime of the Las Vegas algorithm is a random
variable.)

Any ideas?

The simplest idea works: try the algorithm once, see if the output is
sorted, and if not, repeat the process (with fresh randomness, so that
the outcome is independent of the previous rounds).  A code sketch of
this wrapper appears below.

One "round" of the algorithm takes T(n) time to run the algorithm
once, plus O(n) time to check whether the output is sorted.  What is
the expected number of rounds before we stop?  Well, each run of the
algorithm succeeds (independently of the past runs) with probability
at least p, so it's as though we are flipping a coin with bias
(probability of heads) at least p and want to know the expected number
of flips before we see a heads.  From previous classes, we know this
is at most 1/p.  Multiplying the two, the expected running time is
(T(n) + O(n)) * (1/p).

[[In case you haven't seen it before, here is a brute-force
calculation showing that the expected number of flips before we see a
heads is 1/p, when the bias is exactly p.  Let N be the random
variable denoting the number of flips until the first head when
flipping a coin with heads probability exactly p.  Then

  E[N] = \sum_{i >= 1} i * Pr[ we first succeed in the i^{th} round ]
       = \sum_{i >= 1} i * (1-p)^{i-1} * p                  <---- (!)

But note that (1-p)E[N] is then

       = \sum_{i >= 1} i * (1-p)^{i} * p

and replacing the index i by j-1, we get

       = \sum_{j >= 2} (j-1) * (1-p)^{j-1} * p

Renaming j back to i (and noting that the i=1 term is zero, so we may
start the sum at i=1), this is

       = \sum_{i >= 1} (i-1) * (1-p)^{i-1} * p

which looks very much like (!).  So

  E[N] - (1-p)E[N] = \sum_{i >= 1} (i - (i-1)) * (1-p)^{i-1} * p
                   = \sum_{i >= 1} (1-p)^{i-1} * p

This is just a geometric sum, and it sums to 1.  Whew.  Now that the
dust has settled, p*E[N] = 1, or E[N] = 1/p.  And if the heads
probability is at least p, the expected number of flips is at most
1/p.]]
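To make the wrapper concrete, here is a minimal Python sketch (our own
illustration, not code from lecture).  The body of weird_sort is just
a placeholder: the problem only tells us WeirdSort returns some
permutation of its input that is sorted with probability at least p,
so a random shuffle stands in for whatever WeirdSort actually does.
The part that matters is the repeat-until-verified loop.

    import random

    def weird_sort(arr):
        # Placeholder for the hypothetical WeirdSort: always returns some
        # permutation of arr, sorted with probability >= p.  (The shuffle
        # is here only so the sketch runs.)
        out = list(arr)
        random.shuffle(out)
        return out

    def is_sorted(arr):
        # O(n) check that the array is in nondecreasing order.
        return all(arr[i] <= arr[i+1] for i in range(len(arr) - 1))

    def las_vegas_sort(arr):
        # Las Vegas wrapper: rerun WeirdSort with fresh randomness until
        # the output verifies as sorted.  Each round costs T(n) + O(n),
        # and the expected number of rounds is at most 1/p.
        while True:
            out = weird_sort(arr)
            if is_sorted(out):
                return out

For example, las_vegas_sort([3, 1, 2]) always returns [1, 2, 3]; only
its running time is random.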
Question 2b.
------------

What happens if you use this idea to try to convert the Monte Carlo
Simple-MinCut algorithm from Lecture #4, with run-time R(n) and
success probability p, into a Las Vegas algorithm with expected
run-time O((R(n) + n) * 1/p)?

Hmm.  The main idea above was that we could check that the solution
was sorted in linear time.  In the min-cut case, if we had a
"verification" algorithm that ran in time V(n) and told us whether the
output cut was a minimum cut or not, we could use the same idea to get
a Las Vegas algorithm with expected running time (R(n) + V(n)) * (1/p).
[Note this verification algorithm must itself always be correct, so it
should either be a Las Vegas algorithm or a deterministic algorithm.]

But we don't know such a fast verification algorithm: we don't know
how to check whether a proposed cut is a min-cut any faster than
finding a min-cut from scratch.


Question 3: The Return Trip back to Monte Carlo
-----------------------------------------------

Remember Markov's Inequality?  It says that no more than half the
population can be taller than twice the average height, and no more
than a tenth of the population can earn more than 10 times the average
salary.  More formally:

  For any non-negative random variable X with expectation a (in math,
  E[X] = a), and for any c > 0,

        Pr[ X >= c*a ] <= 1/c.

Simple, but surprisingly powerful.

Remember the proof?  Super simple.  Let p be the probability we want
to bound.  Whenever X takes on a value < c*a, imagine rounding it down
to 0; whenever it takes on a value >= c*a, imagine rounding it down to
exactly c*a.  Since X is non-negative, this can only cause the average
to fall.  Now, by the definition of average,

        (1-p)*0 + p*(c*a) <= a,   and so   p <= 1/c.

Good.

Question: Show that if you have a Las Vegas algorithm with expected
running time at most T(n), then you can get a Monte Carlo algorithm
with
  a) worst-case running time at most 2T(n), and
  b) probability of success at least 1/2.

Ideas?  Anyone?  Bueller?

Consider the algorithm: run the Las Vegas algorithm for 2T(n) steps,
and if it has not stopped yet, just abort it.  Trivially, the running
time is now at most 2T(n).  What is the chance it was aborted?  This
equals

  Pr[ the algorithm did not stop by twice its expected run-time ] <= 1/2,

by Markov's inequality.

[[Gory details: What's the random variable?  X = run-time of the
algorithm, clearly a non-negative random variable.  Also, we are given
that E[X] <= T(n).  Therefore

        Pr[ X > 2T(n) ] <= Pr[ X >= 2E[X] ] <= 1/2,

using Markov's inequality with c = 2.]]

Piece of cake.
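Here is a small Python sketch of this truncation trick (again our own
illustration, not code from lecture).  To make "run for at most 2T(n)
steps" concrete, we model a Las Vegas computation as a generator that
yields once per unit of work and returns its answer when done;
lv_find_one is a toy Las Vegas routine invented just for this example.

    import random

    def lv_find_one(bits):
        # Toy Las Vegas routine: probe random positions until we hit a 1.
        # If at least half the entries are 1, the expected number of
        # probes (units of work) is at most 2.
        while True:
            i = random.randrange(len(bits))
            if bits[i] == 1:
                return i     # success; the answer rides along StopIteration
            yield            # one unit of work spent, keep trying

    def truncate(las_vegas_run, budget):
        # Monte Carlo wrapper: drive the Las Vegas computation for at most
        # `budget` units of work, then give up and report failure (None).
        try:
            for _ in range(budget):
                next(las_vegas_run)
        except StopIteration as done:
            return done.value            # finished within the budget
        return None                      # budget exhausted: abort

    # Expected work is at most 2 probes, so a budget of 2*2 = 4 probes
    # succeeds with probability at least 1/2, by Markov's inequality.
    bits = [0, 1] * 10
    print(truncate(lv_find_one(bits), budget=4))

With probability at least 1/2 this prints the index of some 1;
otherwise it prints None, which is exactly the failure mode a Monte
Carlo algorithm is allowed.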
Question 4: Amortized non-space-hogging arrays
----------------------------------------------

In lecture we discussed the "implementing a stack as an array"
problem, and decided that doubling the array whenever it gets full
does well in an amortized sense.  Now, what if we don't want to be
space hogs?  If we do a lot of pops and the stack gets small, then we
want to resize the array to make it smaller.

Say k = number of elements and L = size of the array.  The cost of
performing a resize operation is the number of elements moved.

Strategy #1: when we pop from k = L/2, cut the array size in half.
Is this a good strategy?  [No.]  Why or why not?  [Right after a
halving the array is essentially full, and right after a doubling it
is only half full, so by alternating a few pushes and pops right at
these boundaries we can force a resize of cost Theta(L) every O(1)
operations.  The amortized cost per operation is then Theta(L), not
O(1).]

How about cutting the array size in half when we pop from k = L/4?

Claim: the amortized cost is at most 3 per operation.

Proof: Consider the interval between successive array resizes.  The
important point is that after each resize operation, the array is
exactly half full.  If the resize is a doubling, then there must have
been at least L/2 pushes since the last resize, and these can pay for
the work.  If it is a halving, then there must have been at least L/4
pops since the last resize, and these can pay for the resize.

If you want to use the bank-account method, you can think of each push
or pop as paying $1 to perform the operation and putting $2 in the
bank to pay for a possible future resize.
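To see the policy in code, here is a minimal Python sketch (an
illustration we wrote, not lecture code).  Python lists resize
themselves, so we manage a fixed-size buffer by hand purely to make
the doubling and halving explicit.

    class Stack:
        # Array-backed stack: double when a push finds the array full,
        # halve when a pop leaves it at most a quarter full.  The cost of
        # a resize is the number of elements moved, and after every resize
        # the array is about half full (up to rounding at tiny sizes),
        # which is what the $2-per-operation bank-account argument needs.

        def __init__(self):
            self.cap = 1
            self.n = 0
            self.buf = [None] * self.cap

        def _resize(self, new_cap):
            new_buf = [None] * new_cap
            for i in range(self.n):          # cost = elements moved
                new_buf[i] = self.buf[i]
            self.buf, self.cap = new_buf, new_cap

        def push(self, x):
            if self.n == self.cap:           # full: double
                self._resize(2 * self.cap)
            self.buf[self.n] = x
            self.n += 1

        def pop(self):
            assert self.n > 0, "pop from empty stack"
            self.n -= 1
            x = self.buf[self.n]
            self.buf[self.n] = None
            if self.cap > 1 and self.n <= self.cap // 4:   # quarter full: halve
                self._resize(self.cap // 2)
            return x

Pushing n items and then popping them all does O(n) total work, and
(except at capacity 1) the array is never less than about a quarter
full, so we are no longer space hogs.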