Notes for 01/29 15-852 Randomized Algorithms * Occupancy problems 3.1, 3.6, 2.2.2, skim appendices * Lower bounds for randomized algorithms. * Spencer's graduation/tenure game. Think about presentations you might want to give. E.g., after this lecture, I'm not going to do any more on Chaps 1-4 (except maybe 2-player games). These chapters have several neat algorithms - routing, wiring - that might make a good presentation. Or, interesting algs from recent conferences -- I can suggest some nice ones. ============================================================================== Occupancy problems: ------------------ Say we put m balls randomly into n boxes: what can we say about the distribution that comes up? Motivation behind universal hashing. Let's look at some things we can say, and also how much independence we are using to say them. Let m=n for simplicity. Let B_i be the size of the ith bin. X_j = indicator for ball j falling into ith bin. What is E[B_i]? 1 What is Var[B_i]? 1 - 1/n -> Var[B_i] = sum_j Var[X_j] = n*(1/n - 1/n^2) = 1 - 1/n. What does a single bin look like, distribution of bins? Pr[get zero balls] = (1 - 1/n)^n = (approx) 1/e -> expect 1/e fraction of bins to be empty Pr[exactly 1 ball] = n(1/n)(1-1/n)^{n-1} = (approx) 1/e Pr[exactly 2 balls]= (n choose 2)(1/n^2)(1-1/n)^{n-2} = (approx) 1/2e Pr[get exactly i balls]=(n choose i)(1/n)^i(1-1/n)^{n-i} <= 1/(e*i!) What can we say about largest bin? 1/(e*i!) = 1/n^c at i = O(log(n)/loglog(n)). So, with high probability, largest bin has size at most O(log(n)/loglog(n)). In fact, we expect some bin to be that full. (pick i so have 1/n chance that bin has >= i balls. Prob that bin i is at least this full given that previous bins are not this full is only larger. -> martingales) -> Note: this argument uses full independence. What if we only have pairwise independence. What can we say about size of largest bin? Chebyshev: sigma is about 1, so Pr[bin size < t] < 1/t^2. Expected maximum is at most O(sqrt(n)). Q: Can this happen, say with matrix-based hashing? Say we hash all of {0,1}^n into {0,1}^n, using a random n x n matrix??? Coupon-collector's problem -------------------------- What is the expected number of balls until to fill all bins? 1 way to analyze: likely all filled after n*ln(n) + cn: prob a given bin is empty after k*n balls is 1/e^k, so at k = ln(n) + c, this is 1/n^c. or, let X_i be the time to fill a new bin when there are i empty ones remaining. Total time X = sum_i X_i. Each new ball as i/n chance of filling a new bin, so X_i is geometrically distributed and E[X_i] = n/i. So, expected time is Sum_i n/i = n*H_n. Note: at this point, w.h.p., largest bin has only O(log n) balls. Lower bounds for randomized algorithms ====================================== Consider some problem we want to solve, like sorting. What does it mean to give a lower bound on performance? for all alg A, exists input I s.t. C(A,I) > b. Can think of as game between algorithm designer and adversary, where alg has to go first. Ex. "comparison-based sorting": output is some permutation of input, and only access to input is via comparing two values in input. Theorem: any deterministic comparison-based sorting alg must perform Omega(n log n) comparisons. Proof: If there exists two different permutations of input set that yeild the same sequence of results to the comparisons made, then sorter fails. 1. There are n! different possible permutations to start with. 2. Each comparison splits set of still-possible permutations into two pieces. Pick outcome corresponding to larger piece. 3. Size of set drops by at most a factor of 2 each time, and sorter can't finish until size=1. So number of steps is at least lg(n!) = Omega(n log n). Specific input is one following this path. What about randomized algorithms? View randomized alg as a distribution on deterministic algs A(x,y). Then we want to say that for any distribution on deterministic algs, there exist an input such that expected cost is high. 1) for all dist Q, exists I such that E[C(A_Q, I)] > b Often easiest way to do this is to have adversary go first, and be randomized, and assume alg is deterministic. That is, to show: 2) exists dist P such that for all A, E[C(A,I_P)] > b Can you see why this is sufficient? -> look at E[C(A_Q, I_P)], for some Q and for P of (2). (2) says this must be larger than b, since its larger for all A, and if for some Q this is larger than b, then clearly exists I s.t. it's larger than b. (Using fact that if E[X] > y, then exists legal value of X that is > y) In fact, in the case where algs=circuits, so have finite number, and any distrib is OK, then min-max theorem of game theory says max_P min_A E[C(A, I_P)] == min_Q max_I E[C(A_Q, I)] adv goes first alg goes first Apply to sorting: adversary picks random permutation. So, asking: what is a lower bound on the depth of a random leaf in a binary tree of n! leaves. At most 1/2 of leaves at depth less than lg(n!), so average depth is at least 1/2 of that. Apply to hashing ---------------- Theorem: for any family H of functions h: M-->N, (|M| = m, |N| = n), there exist x != y in M s.t. Prob[ h(x) = h(y) ] >= 1/n - 1/m. h<-H Proof: show that for any h, for random x!=y, Prob[ h(x) = h(y) ] >= 1/n - 1/m. x!=y B_j = number of points in M that map to j. Number of ordered pairs (x != y) that collide is: Sum_j B_j(B_j - 1) = Sum_j(B_j)^2 - m. Minimum occurs when all B_j are equal. Why? No randomness, but consider random variable X = B_j, j=1,...,n with equal probability. Variance = Avg[(B_j)^2] - Avg[B_j]^2 Notice that variance can't be negative. So, Avg[(B_j)^2] >= (m/n)^2. Minimized when all equal (variance=0). So, Sum_j B_j(B_j - 1) >= n(m/n)^2 - m = m^2(1/n - 1/m). There are m(m-1) ordered pairs x,y, so the probability that a random pair collides is at least: [m^2/(m(m-1))] * (1/n - 1/m) >= 1/n - 1/m. Spencer's "graduation game" --------------------------- Begin with some set of people at various positions. 2 players: Paul (partitioner) splits people into two groups Carole (chooser) selects one group to be removed, other advances. Start: people at various years until graduation. Paul wins if somebody graduates, Carole wins if nobody graduates. [Note: similar game: 20 questions with k lies allowed. questioner splits the set, and the chooser gives answer. Those that do not agree with answer move up one. position of element corresponds to # of lies if element is the correct one. questioner's goal is to reduce the number of elements below the k-boundary to 1 as quickly as possible.] Question: given a starting configuration, who wins? Note: this is a full information 2-player game. Let's say that at the start, there are a_k people at distance k from graduation. Theorem 1: If sum_k a_k / 2^k < 1, then Carole has a winning strategy. Theorem 2: If sum_k a_k / 2^k >= 1, then Paul has a winning strategy. Proof of theorem 1. Suppose Carole plays randomly. What is the expected number of people that reach the goal? For instance, for fixed person who is initially at level k, what is the probability that this person reaches the goal? Therefore, no matter what Paul's strategy is, a random Carole has a non-zero chance of winning. Therefore, Paul cannot have a winning strategy and so Carole does. Actually, we can figure out Carole's winning strategy. Any ideas?