Notes for 1/15/97   15-852 Randomized Algorithms

Today: 2 important ideas:

* k-wise independence.
    -- Computing an average while preserving secrecy
* linearity of expectation
    -- Gambler's odds problem
    -- analyzing randomized quicksort

=============================================================================

FINDING THE AVERAGE GRADE, WITHOUT GIVING AWAY YOUR OWN SCORE
-------------------------------------------------------------

* The problem: A group of N friends wants to determine their average
  score on a test, but nobody wants to reveal their own score.

* Even if K of the N "friends" are dishonest, they should still learn
  no more than the sum of the remaining N-K honest ones (which they
  can figure out anyway from the answer). [In a sec we'll get back to
  the question of what it means to say that the protocol leaks no
  additional information.]

Here's a protocol. We'll compute the sum mod M, where M is some
sufficiently large number greater than the sum. E.g., if scores are
in the range 0..100, then M = 101*N works.

Say player i has secret score S. Player i chooses N-1 numbers
X_1, X_2, ..., X_{N-1} independently at random in the range 0,...,M-1
and writes them each on a piece of paper. Then he chooses Y in this
range so that X_1 + ... + X_{N-1} + Y = S mod M, and writes Y on a
piece of paper. Player i then distributes N-1 of these to the others
(one per person) and keeps one.

Then, player i takes the number he kept, plus all of the others he
received, adds them up (mod M), and announces the result. Finally,
all players add up the announced results (mod M), which is the
desired sum.

Analysis: Can see that the sum is right by writing out the matrix of
values: each piece of paper is counted in exactly one announcement,
so the announcements sum to the sum of all the pieces of paper, which
is the sum of the secrets (mod M). But how do we show that this is
secure?

First some definitions:

Two events A and B are INDEPENDENT if Pr[A and B] = Pr[A] * Pr[B].
Easier way to think of it: Pr[A | B] = Pr[A] (assuming Pr[B] > 0).
More generally, a collection of events is independent if for any
subset, the probability of their intersection is the product of their
probabilities. A collection of events is K-WISE INDEPENDENT if every
subset of K of them is independent.

Similarly, random variables X, Y are independent if for each x,y the
events X=x and Y=y are independent. K-wise independence is defined in
the same way as with events.

Some useful facts: Say X is a random variable uniformly distributed
in S = {0, 1, ..., m-1}. Then, clearly, for any fixed a the variable
Y = X + a (mod m) is also uniform in S (though X and Y are
dependent). If X_1, ..., X_k are independent r.v.'s uniform in S, and
Y is defined to be a - (X_1 + ... + X_k) (mod m) for some fixed a,
then these k+1 r.v.'s are k-wise independent. Just check that

  Pr[Y = y | X_2 = x_2, X_3 = x_3, ..., X_k = x_k] = 1/m

(and by symmetry the same holds when conditioning on any other k-1 of
the X's). [A quick numeric check of this fact appears right below.]

Proof for protocol: Claim is that it's secure in the following sense:
for any set of t honest players, the only information that the rest
of the players have, even if they cheat, is the sum of the t players'
values. What does this really mean? What is "information"? What we
mean is that if you modify the values of these t players while
keeping their sum the same, then the probability distribution on the
sequence of outputs seen by the other players doesn't change.

Proof: Consider a modified protocol in which the t honest players all
send their Y value to the same honest player. (By the fact above,
this doesn't change the distribution of anything the other players
see.) Now, for any fixed sequence of coin tosses, all sets of t
values with the same sum lead to exactly the same information
revealed to the other players. So, all sets with the same sum have
the same distribution.
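Here's the promised quick check of the useful fact, as a small Python
sketch (the particular values of m, k, and a are arbitrary choices).
It enumerates all m^k equally likely outcomes of (X_1, ..., X_k) and
verifies that Y is uniform given any fixed values of X_2, ..., X_k:

    # Check: with X_1,...,X_k independent and uniform in {0,...,m-1},
    # and Y = a - (X_1 + ... + X_k) (mod m), any k of the k+1
    # variables are independent.
    from itertools import product
    from collections import Counter

    m, k, a = 5, 3, 2

    # All m^k equally likely outcomes, each extended with its Y value.
    outcomes = [xs + ((a - sum(xs)) % m,)
                for xs in product(range(m), repeat=k)]

    # Fix X_2,...,X_k every possible way; count the values Y takes.
    for fixed in product(range(m), repeat=k - 1):
        counts = Counter(o[k] for o in outcomes if o[1:k] == fixed)
        # Y should hit every value in {0,...,m-1} equally often.
        assert sorted(counts) == list(range(m))
        assert len(set(counts.values())) == 1
    print("Pr[Y = y | X_2,...,X_k] = 1/m for every y: checked")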
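And here's a minimal simulation of the protocol itself (the function
name, the particular scores, and the choice of trial are made up for
illustration):

    import random

    def secure_average(scores, M):
        # Each player splits his score into N shares summing to it
        # mod M: N-1 uniform values plus one dependent value Y.
        N = len(scores)
        shares = []
        for s in scores:
            xs = [random.randrange(M) for _ in range(N - 1)]
            parts = xs + [(s - sum(xs)) % M]   # the Y share
            random.shuffle(parts)   # which share is Y won't matter
            shares.append(parts)    # shares[i][j] goes to player j
        # Player j announces the sum (mod M) of the N shares he
        # holds: the one he kept plus the N-1 he received.
        announced = [sum(row[j] for row in shares) % M
                     for j in range(N)]
        return (sum(announced) % M) / N

    scores = [85, 92, 71, 100, 66]    # the secret test scores
    M = 101 * len(scores)             # exceeds any possible sum
    print(secure_average(scores, M))  # 82.8 = (85+...+66)/5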
GAMBLER'S ODDS PROBLEM
----------------------

Say you go into a casino. Say it has fair games: at any time t you
can bet some amount, and for even money you have a 50/50 chance of
winning. Say you have some strategy -- it determines how much to bet
and when, and when to leave -- but you must leave by the end of the
day. What's the best strategy for the highest expected amount of
money by the end of the day?

Answer: it doesn't matter. All strategies result in the same expected
amount.

* Defn of expectation: E[X] = SUM_a [a * Pr(X=a)].

* Let's define the random variable X_i to be the gain at time i. If X
  is the random variable denoting the amount we win by the end of the
  day, then X = X_1 + X_2 + ... + X_N, where N = #seconds in a day.

Note that these X_i's could be horribly dependent (e.g., depending on
whether you won or lost your last bet, you might bet more or less).
Nonetheless, if we just look at one of the X_i, it's easy to see that
E[X_i] = 0. For instance: let p_ij be the a priori probability that
you will bet j dollars at time i (which might be horrible to
calculate, but that's OK, we don't need to). So,

  E[X_i] = SUM_j [j * p_ij / 2 + (-j) * p_ij / 2] = 0.

Now, use linearity of expectation:

  E[X] = E[X_1 + ... + X_N] = E[X_1] + ... + E[X_N] = 0.

(Lesson: simplify by setting up simple random variables and using
linearity of expectation.) [A quick simulation of this claim appears
at the end of these notes.]

Similar problem: Say there is a stock that's at some price x. Each
day, it goes up or down by $1, independent of the past, unless it's
at 0, in which case it stays there. You have some amount of money m.
Each day you can buy or sell as much as you want, until you die, at
which point your stock is converted back into cash. Is there a
strategy such that the expected amount of money you will have at
death is more than what you started with? No.

LINEARITY OF EXPECTATION
------------------------

Want to show that for 2 random variables X and Y, E[X+Y] = E[X] +
E[Y], even if X and Y are dependent. Let's assume X and Y take on
discrete values so that we can use summations instead of integrals.

E[X+Y] = SUM_a [ a * Pr(X+Y = a) ]

       = SUM_(b,c) [ (b+c) * Pr(X = b and Y = c) ]

         (just splitting the event that the sum is "a" into all the
         (disjoint) ways that this could occur)

       = SUM_b SUM_c [ b * Pr(X=b and Y=c) ]
         + SUM_c SUM_b [ c * Pr(X=b and Y=c) ]

       = SUM_b [ b * Pr(X=b) ] + SUM_c [ c * Pr(Y=c) ]

       = E[X] + E[Y].

RANDOMIZED QUICKSORT
--------------------

--> describe algorithm: pick a pivot element uniformly at random,
    compare every other element to it, and recurse on the two parts.
    (Assume no equal values, for simplicity.)
--> what is the expected number of comparisons?

Let e_1 be the smallest element, ..., e_n the largest. Define the
random variable X_ij = 1 if e_i is compared with e_j in running the
algorithm, and 0 otherwise. So, the expected number of comparisons is
just SUM_{i<j} Pr(X_ij = 1), by linearity of expectation.

Claim: e_i and e_j get compared if and only if the first pivot chosen
from among {e_i, e_{i+1}, ..., e_j} is e_i or e_j. (If the first
pivot from this set is some e_k strictly in between, then e_i and e_j
get split into different parts and are never compared.) Since pivots
are uniform, each of the j-i+1 elements of this set is equally likely
to be the first one chosen, so Pr(X_ij = 1) = 2/(j-i+1). Therefore
the expected number of comparisons is

  SUM_{i<j} 2/(j-i+1) = SUM_{d=1}^{n-1} (n-d) * 2/(d+1)
                      < 2n * (1/2 + 1/3 + ... + 1/n),

which is about 2n*ln(n). [A quick empirical check appears at the end
of these notes.]

A PARADOX
---------

To finish, a puzzle. Say I write a number on each side of a card: Z
on one side and 3Z on the other, where Z = 3^k with probability
(1/2)^(k+1), for k = 0, 1, 2, .... I show you a uniformly random one
of the two sides; call the number you see X and the hidden number Y.
You win whichever you choose: you can keep X, or "flip" the card and
take Y instead.

Claim: whatever you see, you should flip. If X = 1, then Y = 3X for
sure, so certainly flip. (If X > 1, then with prob 2/3, Y is X/3 and
with prob 1/3, Y is 3X -- check this from the distribution on Z.) So,
e.g., if X = 3, the expected gain of "flip" is
2/3 * (-2) + 1/3 * 6 = 2/3 > 0, and in general it's 2X/9 > 0. So you
should flip.

But, we are saying that no matter what you see, you should flip. So,
might as well flip before observing anything. But, this is just like
not flipping, since you were shown a random one of the two sides....
So, what gives.....?
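Going back to the gambler's odds problem, here's the promised
simulation, as a Python sketch. It tries one specific adaptive
strategy -- "double your bet until the first win, then quit" -- and
the day length and trial count are arbitrary choices; the claim is
that any other strategy gives the same answer:

    import random

    def double_until_win(max_bets=10):
        # Bet $1; double after each loss; quit after the first win
        # (or when the day runs out).
        gain, bet = 0, 1
        for _ in range(max_bets):
            if random.random() < 0.5:   # even-money fair game
                return gain + bet       # won: walk away
            gain -= bet                 # lost: double and go again
            bet *= 2
        return gain                     # day over, take the losses

    trials = 500_000
    avg = sum(double_until_win() for _ in range(trials)) / trials
    print(avg)   # close to 0, though ~99.9% of days end $1 ahead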
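Here's the promised empirical check of the quicksort bound: count the
comparisons made by randomized quicksort on a random input and
compare with 2n*ln(n) (a sketch; n and the trial count are arbitrary
choices):

    import math, random

    def comparisons(a):
        # Randomized quicksort, returning only the comparison count.
        if len(a) <= 1:
            return 0
        p = random.choice(a)            # uniformly random pivot
        less = [x for x in a if x < p]
        more = [x for x in a if x > p]
        # len(a)-1 comparisons against the pivot (distinct values).
        return len(a) - 1 + comparisons(less) + comparisons(more)

    n, trials = 1000, 100
    avg = sum(comparisons(random.sample(range(10**6), n))
              for _ in range(trials)) / trials
    print(avg, 2 * n * math.log(n))     # roughly 11000 vs. 13816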
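Finally, before deciding what gives, one can at least confirm the
conditional claim in the puzzle by simulation (a sketch under the
card distribution described above; the trial count is arbitrary):

    import random

    def deal():
        # The card is (3^k, 3^(k+1)) with probability (1/2)^(k+1).
        k = 0
        while random.random() < 0.5:
            k += 1
        lo, hi = 3 ** k, 3 ** (k + 1)
        # Show a uniformly random side: return (seen, hidden).
        return (lo, hi) if random.random() < 0.5 else (hi, lo)

    # Average gain from flipping, conditioned on seeing X = 3.
    gains = [y - x for x, y in (deal() for _ in range(1_000_000))
             if x == 3]
    print(sum(gains) / len(gains))      # close to 2/3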