15-451/651 Algorithms 9/11/13
RECITATION NOTES
- any questions about lectures?
- discuss one or more of the problems below
==============================================================
Question 1:
-----------
Suppose we want to find a min-cut in a weighted graph, where each edge (x,y)
has a non-negative weight $w(x,y) >= 0$. The goal is now to find a cut with
minimum total weight of edges, instead of the min cardinality set of edges.
Any suggestions for an algorithm for the weighted minimum cut problem?
Modify the Simple MinCut algorithm as follows: We pick a random edge from
the graph, but now the probability of an edge (x,y) being picked is
proportional to w(x,y). Contract it, remove self-loops, and continue.
(If the edge weights are all 1, we recover our original algorithm.)
The analysis of the unweighted algorithm given in Lecture #4 goes through
unchanged. (For Fact 1, show that the min-weight-cut is at most the weight
of edges incident to any vertex, the "weighted degree". For Claim 5, show
that the total weight of edges must be at least (n-i)k/2, where k is the
weight of the minimum weight cut.)
How fast can this algorithm be implemented?
The O(n^2) time implementation -- where each contraction is implemented in
O(n) time -- goes through unchanged. The extension of the method based on
Kruskal's algorithm and finding MSTs requires a little more effort. (Bonus
problem: can you figure out how to extend the Kruskal-based algorithm to
the weighted setting?)
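As a concreteness check, here is a minimal sketch of the weighted contraction algorithm in Python (the graph representation and function names are our own, not from lecture; it assumes a connected graph): an edge is chosen with probability proportional to its weight, contracted, and parallel edges are merged by summing their weights.

```python
import random

def weighted_min_cut_once(weights, rng=random):
    """One run of the weighted contraction algorithm.

    weights: dict mapping frozenset({x, y}) -> w(x, y) > 0.
    Assumes the graph is connected. Returns (cut weight, side A, side B)
    for the cut found on this run.
    """
    nodes = set().union(*weights)
    groups = {v: {v} for v in nodes}   # supernode label -> original vertices
    w = dict(weights)
    while len(groups) > 2:
        # pick an edge with probability proportional to its weight
        r = rng.uniform(0, sum(w.values()))
        for e, wt in w.items():
            r -= wt
            if r <= 0:
                break
        x, y = tuple(e)
        # contract y into x: merge parallel edges, drop self-loops
        groups[x] |= groups.pop(y)
        for f in list(w):
            if y in f:
                wt2 = w.pop(f)
                (z,) = f - {y}
                if z != x:              # f = {x, y} would be a self-loop
                    g = frozenset({x, z})
                    w[g] = w.get(g, 0) + wt2
    side_a, side_b = groups.values()
    return sum(w.values()), side_a, side_b
```

As in the unweighted case, one run succeeds only with some probability, so in practice you run it many times and keep the lightest cut seen.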
Question 2: (Going from Monte Carlo to Las Vegas)
-------------------------------------------------
Suppose you had a (Monte Carlo) randomized algorithm WeirdSort that
takes an unsorted array I of elements as input, and outputs another
array O. The algorithm always runs in time T(n), and the output O is
always some permuted version of the input I. Moreover, we know that
Pr[ output O is in sorted order ] >= p.
So the algorithm does the right thing with probability at least p.
Remember: this probability is over the randomness of the algorithm and
holds for every input I.
Question: Prove that you can convert such an algorithm into a (Las Vegas)
randomized sorting algorithm that has expected running time O((T(n)+n) *
1/p).
(Recall that the definition of Las Vegas algorithms means we always
have to output a *sorted* array containing the elements of the input
array A, but now the runtime of the Las Vegas algorithm is a random
variable.)
Any ideas?
The simplest idea works: try the algorithm once, see if the output is
sorted, and if not, repeat the process (with fresh randomness, so that
the outcome is independent of the previous rounds). One "round" of the
algorithm takes T(n) time to run the algorithm once, and O(n) time to
check if the output is sorted. What is the expected number of rounds we
will take before we stop?
Well, each run of the algorithm succeeds (independently of the past
runs) with probability at least p, so it's as though we are flipping a
coin with bias (probability of heads) at least p and want to know the
expected number of flips before we see a heads. From previous classes,
we know this is at most 1/p.
Multiplying the two, the expected running time is
( T(n) + O(n) ) * (1/p).
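The wrapper loop can be sketched in Python as follows; WeirdSort here is a hypothetical stand-in (a random shuffle), just so the wrapper has something to call.

```python
import random

def weird_sort(a, rng=random):
    """Hypothetical stand-in for WeirdSort: a random shuffle. On distinct
    elements the output is sorted with probability p = 1/n!, so the
    wrapper below still terminates, just slowly."""
    out = list(a)
    rng.shuffle(out)
    return out

def is_sorted(a):
    # the O(n) check of the output
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))

def las_vegas_sort(a, monte_carlo=weird_sort):
    # Each round costs T(n) for the call plus O(n) for the check; fresh
    # randomness each round makes the rounds independent, so the expected
    # number of rounds is at most 1/p.
    while True:
        out = monte_carlo(a)
        if is_sorted(out):
            return out
```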
[[In case you haven't seen it before, here is a brute force calculation
showing that the expected number of flips before we see a heads is 1/p,
when the bias is p. Let N be the random variable denoting the number of
flips until the first head when flipping a coin with heads probability
exactly p. Then
E[N]
= \sum_{i >= 1} (i * Pr[ we first succeed in the i^{th} round ] )
= \sum_{i >= 1} (i * (1-p)^{i-1} * p )    <---- (!)
But note that (1-p)E[N] is then
= \sum_{i >= 1} (i * (1-p)^{i} * p )
and replacing the index i by j-1 (that is, setting j = i+1), we get
= \sum_{j >= 2} ((j-1) * (1-p)^{j-1} * p )
now renaming the index j back to i (the i=1 term is zero, so we can
start the sum there), we get
= \sum_{i >= 1} ((i-1) * (1-p)^{i-1} * p )
which looks very much like (!).
So E[N] - (1-p)E[N]
= \sum_{i >= 1} ((i - (i-1)) * (1-p)^{i-1} * p )
= \sum_{i >= 1} (1-p)^{i-1} * p
This is just a geometric sum, and it sums to 1.
Whew. Now that the dust has settled, p * E[N] = 1, or E[N] = 1/p. And if
the heads probability is at least p, the expected number of flips
becomes at most 1/p. ]]
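If the algebra feels slippery, a quick simulation (our own sketch, in Python) confirms E[N] = 1/p:

```python
import random

def flips_until_heads(p, rng):
    """Number of flips of a bias-p coin until the first heads."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n

rng = random.Random(451)
p = 0.3
trials = 200_000
avg = sum(flips_until_heads(p, rng) for _ in range(trials)) / trials
# avg comes out close to 1/p = 3.33...
```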
Question 2b.
------------
What happens if you use this idea to try to convert the Monte Carlo
Simple-MinCut algorithm from Lecture #4 with run-time R(n) and success
probability p, into a Las Vegas algorithm with expected run-time
O((R(n) + n) * 1/p)?
Hmm.
The main idea above was that we could check that the solution was sorted
in linear time. In the min-cut case, if we had a "verification"
algorithm that ran in time V(n) and told us if the output cut was a
minimum cut or not, we could use the same idea to get a Las Vegas
algorithm with running time
(R(n) + V(n)) * (1/p).
[Note this verification algorithm must itself always be correct, so it
should either be a Las Vegas algorithm or a deterministic algorithm.]
But we don't know a faster verification algorithm. We don't know how to
check whether a proposed cut is a min-cut in time faster than finding a
min-cut.
Question 3: The Return Trip back to Monte Carlo
-----------------------------------------------
Remember Markov's Inequality? It says that at most half the population
can be more than twice the average height. Similarly, no more than a
tenth of the population can earn more than 10 times the average salary.
More formally:
For any non-negative random variable X, let its expectation be $a$.
In math, E[X] = a. Then for any $c > 0$,
Pr[ X >= c*a ] <= 1/c.
Simple, but surprisingly powerful.
Remember the proof? Super simple. Let p be the probability we want to
bound. Whenever X takes on a value < c*a, imagine replacing it by 0;
whenever it takes on a value >= c*a, imagine replacing it by c*a. This
can only lower the average, since X was non-negative. Now, by the
definition of average, (1-p)*0 + p*c*a <= a. So p <= 1/c.
Good.
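To see the inequality in action, here is a small empirical check (our own sketch, in Python) on a non-negative random variable we already understand: the number of fair-coin flips until the first heads, which has average 2.

```python
import random

# Empirical look at Markov's inequality: X = number of fair-coin flips
# until the first heads, a non-negative random variable with E[X] = 2.
rng = random.Random(15451)

def flips(rng):
    n = 1
    while rng.random() >= 0.5:
        n += 1
    return n

samples = [flips(rng) for _ in range(100_000)]
a = sum(samples) / len(samples)   # empirical mean, close to 2
# fraction of the "population" at or above c times the average;
# Markov promises this is at most 1/c for each c
frac = {c: sum(x >= c * a for x in samples) / len(samples)
        for c in (2, 4, 10)}
```

Note Markov is often very loose: for this distribution the true tail probabilities are far below 1/c.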
Question: Show that if you have a Las Vegas algorithm with expected
running time at most T(n), then you can get a Monte Carlo algorithm with
a) worst-case running time at most 2T(n) and
b) probability of success at least 1/2.
Ideas? Anyone? Bueller?
Consider the algorithm: run the L-V algorithm for 2T(n) steps, and if it
has not stopped yet, just abort it. Trivially, the running time is now
2T(n). What is the chance it was aborted? This equals
Pr[ the algorithm did not stop by twice its expected run-time ]
<= 1/2, by Markov's inequality.
[[Gory details: What's the random variable? X = run-time of the algo.
Clearly a non-negative r.v.. Also, we are given that E[X] <= T(n).
Therefore
Pr[ X > 2T(n) ] <= Pr[ X > 2E[X] ] <= 1/2.
]]
Piece of cake.
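One way to sketch this truncation in Python: model the Las Vegas algorithm as a generator that yields once per "step" and returns its answer when done. (This step-counting interface is our own device for the sketch, not a standard one.)

```python
def run_with_budget(make_las_vegas, budget):
    """Run a Las Vegas algorithm for at most `budget` steps.

    make_las_vegas: a zero-argument function returning a generator that
    yields once per step and returns its answer when finished.
    Returns the answer, or None if we had to abort -- the Monte Carlo
    failure case, which Markov bounds by 1/2 when budget = 2 * E[steps].
    """
    gen = make_las_vegas()
    for _ in range(budget):
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value   # finished within the budget
    return None                 # aborted at the budget

def three_step_search():
    # toy "Las Vegas" routine: yields 3 times, then returns its answer
    for _ in range(3):
        yield
    return "answer"
```

For example, `run_with_budget(three_step_search, 4)` finishes and returns the answer, while a budget of 2 forces an abort.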
Question 4: Amortized non-space-hogging arrays
----------------------------------------------
In lecture we discussed the "implementing a stack as an array"
problem, and decided that doubling the array whenever it gets full
does well in an amortized sense. Now, what if we don't want to be
space hogs. If we do a lot of pops and the stack gets small, then we
want to resize the array to make it smaller.
Say k = number of elements and L = size of array. Cost for performing
a resize operation is equal to the number of elements moved.
Strategy #1: when a pop brings us down to k = L/2, cut the array size
in half. Is this a good strategy? [No: if the array is full and we then
alternate push, pop, push, pop, ..., each push doubles the array and
each pop immediately halves it again, so every single operation costs
Theta(L) and the amortized cost is Theta(L), not O(1).]
How about cutting in half when a pop brings us down to k = L/4?
Claim: amortized cost is at most 3 per operation.
Proof: Consider the interval between successive array resizes. The
important point is that after each resize operation, the array is
exactly half full.
If it's a doubling, the array went from half full to full, so there
must have been at least L/2 pushes since the last resize (L being the
size just before the doubling); these pushes can pay for the cost-L copy.
If it's a halving, the array went from half full to a quarter full, so
there must have been at least L/4 pops, and these can pay for the
cost-L/4 copy.
If you want to use the bank-account method, you can think of each
push or pop as paying $1 to perform the operation and putting $2 in
the bank to pay for a possible future resizing.
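The strategy (and the accounting) can be sketched in Python as follows; the class layout is our own, and `moves` tallies the total resize cost so the amortized bound can be checked directly. Note that after every resize the array is exactly half full, as the proof requires.

```python
class Stack:
    """Array-backed stack: double when full, halve when a pop leaves the
    array one-quarter full. Resize cost = number of elements moved."""

    def __init__(self):
        self.L = 1            # current array size
        self.k = 0            # number of elements stored
        self.arr = [None]
        self.moves = 0        # total resize cost, for amortized accounting

    def _resize(self, new_L):
        new_arr = [None] * new_L
        new_arr[:self.k] = self.arr[:self.k]
        self.moves += self.k  # pay one per element moved
        self.arr, self.L = new_arr, new_L

    def push(self, x):
        if self.k == self.L:
            self._resize(2 * self.L)    # double: now exactly half full
        self.arr[self.k] = x
        self.k += 1

    def pop(self):
        self.k -= 1
        x = self.arr[self.k]
        self.arr[self.k] = None
        if self.L > 1 and self.k == self.L // 4:
            self._resize(self.L // 2)   # halve: now exactly half full
        return x
```

With the bank-account scheme, n operations deposit at most $2n for resizes, so `moves` should never exceed twice the number of operations.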