15-451 Algorithms 10/06/10
RECITATION NOTES
* Maximum bottleneck path
* Coupon-collector's problem
=======================================================================
In class we saw a couple of greedy tree-growing algorithms for
different problems.
*Dijkstra for computing shortest path tree*
initialize:
- tree = {start}. No edges. label start as having distance 0.
[invariant: nodes in tree have correct distance to start.]
repeat:
- for each nbr x of tree compute (over)-estimate of distance to start:
distance(x) = min [ distance(v) + length(e) ]
edges e=(v,x)
v in tree
[by our invariant, this is the length of the shortest path to x
whose only edge not in the tree is the very last edge.]
- put x of minimum distance into tree, connecting it via the argmin
(the edge e used to get distance(x))
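The tree-growing loop above can be sketched in Python. This is an illustrative sketch, not the exact in-class presentation: it uses a heap of distance estimates instead of recomputing the min over all tree neighbors each round, and it assumes the graph is given as a dict mapping each node to a list of (neighbor, length) pairs.

```python
import heapq

def dijkstra(graph, start):
    """Shortest-path tree by greedy tree growing.
    graph: dict mapping node -> list of (neighbor, length) pairs.
    Returns dict mapping node -> distance from start."""
    distance = {start: 0}
    # heap holds (over-estimate, node); outdated entries are skipped
    heap = [(0, start)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > distance.get(v, float('inf')):
            continue  # stale estimate for a node already in the tree
        for x, length in graph[v]:
            est = d + length  # distance(v) + length(e)
            if est < distance.get(x, float('inf')):
                distance[x] = est
                heapq.heappush(heap, (est, x))
    return distance
```

The invariant still holds: when a node is popped with an up-to-date estimate, that estimate is its true shortest distance.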
*Prim for computing minimum spanning tree*
- just like Dijkstra but remove "distance(v)" in the distance(x)
calculation. I.e., it's just the length of the shortest edge
connecting x to the tree.
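Dropping the distance(v) term gives Prim. A minimal sketch under the same assumptions (adjacency-list dict, undirected graph with each edge listed in both directions):

```python
import heapq

def prim(graph, start):
    """Minimum spanning tree by greedy tree growing (Prim).
    graph: dict node -> list of (neighbor, weight), undirected.
    Returns list of MST edges as (u, v, weight) triples."""
    in_tree = {start}
    tree_edges = []
    # heap keyed on edge weight alone -- no distance(v) term
    heap = [(w, start, x) for x, w in graph[start]]
    heapq.heapify(heap)
    while heap:
        w, v, x = heapq.heappop(heap)
        if x in in_tree:
            continue  # edge no longer crosses the cut
        in_tree.add(x)
        tree_edges.append((v, x, w))
        for y, wy in graph[x]:
            if y not in in_tree:
                heapq.heappush(heap, (wy, x, y))
    return tree_edges
```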
Here is a different problem, called the maximum bottleneck path
problem.
MAXIMUM BOTTLENECK PATH
-----------------------
Suppose we think of edge weights as "widths" rather than "lengths" and
we want to find the widest path from s to everywhere in the graph. Say
we define the width of a path as the minimum width of all edges on the
path and let's use widthto(v) to denote the width of the widest path
from s to v. E.g., think of edge weights as capacities (number of
megabits/second you can send on a link).
How can we modify Dijkstra's algorithm to solve this problem?
Initialization: set widthto(s) = infinity.
general step:
widthto(x) = max [ min [ widthto(v), length(e)] ]
edges e=(v,x)
v in tree
and now put the node x of *maximum* widthto(x) into the tree.
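The modified update rule can be sketched the same way as Dijkstra, with max/min replacing sum/min and a max-heap (negated keys) replacing the min-heap. This sketch assumes all edge widths are positive.

```python
import heapq

def widest_paths(graph, start):
    """Maximum bottleneck ("widest") paths from start.
    graph: dict node -> list of (neighbor, width) pairs, widths > 0.
    Returns dict node -> width of the widest path from start."""
    widthto = {start: float('inf')}
    # Python's heapq is a min-heap, so negate widths to pop the max
    heap = [(-float('inf'), start)]
    while heap:
        neg_w, v = heapq.heappop(heap)
        w = -neg_w
        if w < widthto.get(v, 0):
            continue  # stale entry
        for x, width in graph[v]:
            cand = min(w, width)  # min[widthto(v), width(e)]
            if cand > widthto.get(x, 0):
                widthto[x] = cand
                heapq.heappush(heap, (-cand, x))
    return widthto
```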
-------------------------------------------
Coupon-collector's problem
==========================
This is a problem that's a good one for practice with probability and
linearity of expectation.
The problem is: suppose we have n boxes (let's label them 1,...,n)
and we repeatedly throw balls into the boxes at random. What is the
expected number of throws until every box has at least one ball in it?
Answer: nH_n
(where H_n = 1+1/2+1/3+...+1/n)
The proof is as follows. Imagine a counter (starting at 0) that tells
us how many boxes have at least one ball in them. Let X_1 denote the
number of throws until the counter reaches 1 (so X_1 = 1). Let X_2
denote the number of throws from that point until the counter reaches
2. In general, let X_k denote the number of throws made from the time
the counter hit k-1 up until the counter reaches k.
So, the total number of throws is X_1 + ... + X_n, and by linearity of
expectation, what we are looking for is E[X_1] + ... + E[X_n].
How to evaluate E[X_k]? Suppose the counter is currently at k-1.
Each time we throw a ball, the probability it is something new is
(n-(k-1))/n. So, another way to think about this question is as follows:
Coin flipping: we have a coin that has probability p of coming up
heads (in our case, p = (n-(k-1))/n). What is the expected number
of flips until we get a heads?
It turns out that the "intuitively obvious answer", 1/p, is correct.
But why? Here is one way to see it: if the first flip is heads,
then we are done; if not, then we are back where we started, except
we've already paid for one flip. So the expected number of flips E
satisfies: E = p*1 + (1-p)*(1 + E). You can then solve for E = 1/p.
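One quick sanity check on E = 1/p is to sum the series directly: the probability the first heads comes on flip k is p*(1-p)^(k-1), so E = sum over k of k*p*(1-p)^(k-1). A small sketch (truncating the series, which converges fast for any fixed p > 0):

```python
def expected_flips(p, terms=10000):
    """Truncated series sum_{k>=1} k * p * (1-p)^(k-1),
    which should converge to 1/p."""
    return sum(k * p * (1 - p) ** (k - 1) for k in range(1, terms + 1))
```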
The above argument is a little tricky, so if you want, you can draw
a picture of a tree like this, where what we are looking for is the
sum, over all leaves, of the depth of the leaf times the probability
of reaching that leaf.
*
p/ \(1-p)
* *
p/ \(1-p)
* *
...
And if we let T denote that tree, the argument is saying just that one
simple way of analyzing T is to view T as:
*
p/ \(1-p)
* T
Putting this all together, let CC(n) be the expected number of
throws until we have filled all the boxes. We then have:
CC(n) = E[X_1] + ... + E[X_n]
= n/n + n/(n-1) + n/(n-2) + ... + n/1
= n(1/n + 1/(n-1) + ... + 1/1)
= n H_n.
QED.
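The whole calculation is easy to check numerically: compute n*H_n exactly and compare it against an empirical average over random trials. The function names and trial counts below are illustrative choices, not from the notes.

```python
import random

def cc_expected(n):
    """Exact expectation n * H_n."""
    return n * sum(1.0 / k for k in range(1, n + 1))

def cc_simulate(n, trials=2000, seed=0):
    """Empirical average number of throws until all n boxes
    have at least one ball."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = set()
        throws = 0
        while len(seen) < n:
            seen.add(rng.randrange(n))  # throw a ball into a random box
            throws += 1
        total += throws
    return total / trials
```

For example, cc_expected(2) = 2*(1 + 1/2) = 3: with two boxes, the first throw always fills a new box, and the second box then takes 2 more throws in expectation.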