Notes for 10/20/98

Last time: relation between the eigenvalue gap and rapid mixing.  The
L_2 distance between p^(t) and the stationary distribution pi is at
most |lambda_2|^t.  Also mentioned the relation between expansion and
the eigenvalue gap.  Specifically, for an n-node, d-regular graph G,
let lambda_2 be the second-largest eigenvalue of the adjacency matrix
A(G).  Then (we'll look at these next time):

Theorem: For all |S| <= n/2, |N(S) - S| >= |S|*(d - lambda_2)/(2d).

Theorem: If for all |S| <= n/2, |N(S) - S| >= c|S| for some c > 0,
then lambda_2 <= d - c^2/(4 + 2c^2).

Today: building expanders and some of their uses.

RANDOM LOW-DEGREE GRAPHS ARE EXPANDERS
--------------------------------------
[Here is a fixed proof for the degree 3 case]

Claim: Say we create an (n,n) bipartite graph as follows: for each node
on the left, we pick a random subset of d=3 nodes on the right to be
its neighbors.  Then there exists a constant c such that w.h.p., for
all sets S on the left with |S| <= n/c, we have |N(S)| >= 1.9*|S|.
[Equivalently, we can view this as creating an n-node directed graph by
giving each node 3 random out-edges.]

Proof: For a given value of s, what is the probability that there
*exists* a set S of s vertices on the left with <= r neighbors on the
right?  We can upper bound this by fixing a set S of s nodes on the
left and a set R of r nodes on the right, calculating the probability
that all neighbors of S lie inside R, and then multiplying by
(n choose s)*(n choose r):

  Pr(exists such an S) <= (n choose s)(n choose r)(r choose d)^s / (n choose d)^s

        [now, using (a/b)^b <= (a choose b) <= (ae/b)^b]

     <= (ne/s)^s (ne/r)^r (re/d)^{ds} / (n/d)^{ds}

     <= e^{s + r + ds} * r^{ds - r} / ( n^{ds - s - r} s^s )

        [now, plugging in d=3, r=1.9s]

     <= e^{6s} r^{1.1s} / ( n^{0.1s} s^s )

      = (1.9 e^6 r^{0.1} / n^{0.1})^s.

We now need to sum this over all values of s.  For s=1 this looks like
const/n^{0.1}; for s=2 it looks like const/n^{0.2}.  The quantity keeps
dropping until s starts getting large.  We're OK so long as the
numerator 1.9 e^6 r^{0.1} stays sufficiently smaller than the
denominator n^{0.1} (say, at most half of it); since r = 1.9s <=
1.9n/c, this holds for sufficiently large c, so the terms keep decaying
geometrically even once s gets large and the whole sum stays small.

Application #1
--------------
Lots of applications to things like routing without congestion,
fault-tolerant networks, etc.  Here is something Prabhakar and I
thought of after the business meeting at the STOC conference one year.

STOC nowadays has parallel sessions: t talks in t/2 time slots.
Problem: this forces attendees to take sides.  Also, if you want to see
a random subset of k talks, you expect to get a conflict at about
k = sqrt(t).  But people don't want to switch back to single sessions,
because then there are fewer talks and the conference becomes too
exclusive.  How to solve?  How can the conference last for only t/2
time slots but still allow attendees to see all their desired talks
(assuming each attendee wants to see only c*t of the talks, for some
constant fraction c)?

Answer: Have each person give their talk twice.  Actually, let's have
each talk given 3 times.  We can view this as a bipartite graph: talks
on the left, time slots on the right; t nodes on the left, t/2 on the
right.  Given a set of desired talks S, we can figure out how to see
those talks by finding a matching of S into the time slots on the
right.  If we do the talk assignments randomly, then the same
calculation as above shows that w.h.p. all sets S of < ct nodes on the
left have |N(S)| >= |S|.  [Can even do this while keeping the in-degree
on the right equal to 6, so each slot has 6 parallel sessions.]  By
Hall's theorem, this means that every set S with |S| < c*t on the left
has a matching that saturates it, i.e., every desired talk gets its own
slot.  (A small sketch of this computation follows below.)
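[Aside: a minimal Python sketch of the scheduling step just described.
assign_talks builds the random talks-to-slots bipartite graph (3 random
slots per talk), and schedule finds the matching promised by Hall's
theorem via simple augmenting paths.  The function names and the
particular numbers (200 talks, 30 desired) are just for illustration.]

import random

def assign_talks(t, copies=3, seed=0):
    # Assign each of t talks to `copies` distinct random time slots out
    # of t//2 slots -- the random bipartite graph from the claim above.
    rng = random.Random(seed)
    slots = t // 2
    return {talk: rng.sample(range(slots), copies) for talk in range(t)}

def schedule(desired, assignment):
    # Try to give each desired talk its own time slot, via augmenting-path
    # bipartite matching (the constructive side of Hall's theorem).
    match = {}                          # slot -> talk currently placed there

    def try_place(talk, seen):
        for slot in assignment[talk]:
            if slot in seen:
                continue
            seen.add(slot)
            if slot not in match or try_place(match[slot], seen):
                match[slot] = talk
                return True
        return False

    for talk in desired:
        if not try_place(talk, set()):
            return None                 # some subset of desired talks fails Hall's condition
    return {talk: slot for slot, talk in match.items()}

# e.g.: 200 talks in 100 slots, an attendee who wants to see 30 of them
assignment = assign_talks(200)
wanted = random.Random(1).sample(range(200), 30)
print(schedule(wanted, assignment))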
[In the paper being handed out, we looked at the case of each talk
given twice.  That graph is not an expander, but "most sets" expand -->
for any collection of a polynomial number of attendees, with high
probability they are all happy.]

Application #2: amplifying the success probability of a randomized
algorithm without using a lot more random bits.

For this application, we'll need the fact that there are deterministic
constructions of constant-degree expanders with the property that even
if there are 2^n nodes, we can perform a walk in polynomial time per
step.  (Given the name of a node, we can quickly compute who its
neighbors are.)

Gabber-Galil construction
-------------------------
|X| = |Y| = n = m^2.  Each vertex in X is labeled by a pair (x,y) with
x,y in Z_m; same for the vertices in Y.  Degree 5.  The neighborhood of
(x,y) is (arithmetic mod m):

   (x+y+1, y)   (x+y, y)   (x, y)   (x, x+y)   (x, x+y+1)

Specifically, here is the result of [Impagliazzo & Zuckerman]:
--------------------------------------------------------------

* We have a BPP algorithm using r random bits, with failure probability
  1/100.  Claim: we can decrease the failure probability to (1/100)^k
  by using only r + O(k) random bits (in contrast to the r*k bits
  needed to run the algorithm k times independently).

* Idea: set up an implicit expander graph with one node for each string
  of length r.  Start at a random initial position and then do a random
  walk; we only need a constant number of random bits per step.  Sample
  every beta steps (i.e., run the BPP algorithm using the name of the
  current node as its random input), where beta is a constant chosen to
  make the second-largest eigenvalue of R = M^beta at most 1/10.
  Sample 42k times and take a majority vote.  If we color nodes "good"
  or "bad" depending on whether they cause the BPP algorithm to answer
  correctly or not, then what we want is for it to be very unlikely
  that more than half of the samples are "bad" nodes.

* We'd like to say that no matter where you start, after running one
  step of R there's at most a 1/5 chance of being at a bad node.  We
  can't quite get this, but we get something similar by looking at L_2
  length.  In particular, for any vector p,

     sqrt(sum of squares of the bad entries of pR) <= 1/5 * (L_2 length of p).

  Proof: say the eigenvectors are e_1, e_2, ..., where
  e_1 = (1/n,...,1/n); all are orthogonal.  Let p = x + y, where
  x = c_1 e_1 and y = c_2 e_2 + ... + c_n e_n.  For convenience, define
  Z as the matrix that zeroes out all the good entries; i.e., Z is the
  identity but with the entries for good nodes zeroed out.  So our goal
  is to show that ||pRZ|| <= 1/5 * ||p||.

  Look at x: ||xRZ|| = ||xZ|| <= 1/10 * ||x||.  (xR = x, and since at
  most a 1/100 fraction of the entries of the uniform vector x are bad,
  zeroing out the rest shrinks its L_2 length to at most
  sqrt(1/100) * ||x||.)

  Look at y: ||yRZ|| <= ||yR|| = ||c_2 lambda_2^beta e_2 + ... +
  c_n lambda_n^beta e_n|| <= 1/10 * ||y||.

  So,  ||pRZ|| <= ||xRZ|| + ||yRZ||        (triangle inequality)
              <= 1/10 ||x|| + 1/10 ||y||
              <= 1/5 * ||p||               (since ||x||, ||y|| <= ||p||).

  (Note: this also shows that ||pRRZ|| <= 1/5 * ||p||, etc.)

  Intuitively, if p was "spread out" already, then so is pR, and
  multiplying by Z zeroes out a lot of the weight of its entries.  On
  the other hand, if p is highly concentrated, then multiplying by R by
  itself already decreases the L_2 length, by spreading out the
  distribution.
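[Aside: a quick numerical sanity check of the lemma, as a Python/numpy
sketch.  It does not build the Gabber-Galil graph; instead it uses a
small random regular multigraph made from a few random permutations
(assumed, not proved, to be a decent expander), computes beta from that
graph's actual second eigenvalue, marks a random 1% of the nodes "bad",
and then checks the spectral norm of R*Z, which bounds ||pRZ||/||p||
over all p.  All parameters are illustrative.]

import numpy as np

rng = np.random.default_rng(0)
n, half_d = 400, 4                     # an 8-regular multigraph on 400 nodes
A = np.zeros((n, n))
for _ in range(half_d):
    P = np.eye(n)[rng.permutation(n)]  # random permutation matrix
    A += P + P.T                       # add both directions to keep A symmetric
M = A / (2 * half_d)                   # transition matrix of the random walk

# pick beta so that the second-largest |eigenvalue| of R = M^beta is <= 1/10
# (assumes the graph came out connected and non-bipartite, so lam2 < 1)
eigs = np.sort(np.abs(np.linalg.eigvalsh(M)))[::-1]
lam2 = eigs[1]
beta = int(np.ceil(np.log(1 / 10) / np.log(lam2)))
R = np.linalg.matrix_power(M, beta)

# Z = identity with the "good" diagonal entries zeroed out; here a random
# 1% of the nodes play the role of the bad nodes
bad = rng.choice(n, size=n // 100, replace=False)
Z = np.zeros((n, n))
Z[bad, bad] = 1.0

# the spectral norm of R @ Z is the max over p of ||p R Z|| / ||p||
print(beta, np.linalg.norm(R @ Z, 2))  # the lemma says this is at most 1/5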
* Now, to finish the proof: we want to say it's unlikely that more than
  half the samples are bad.  Let's consider a fixed set of t samples
  and ask: what's the probability that these are all bad?

  Claim: if q is our starting distribution, then what we want is the
  L_1 length of

     q R R R Z R Z R R Z R Z

  where the t "Z"s index the t samples we took (there's at least one
  "R" between any two "Z"s).  We'll use the fact that L_1 length <=
  sqrt(n) * L_2 length.  Applying the lemma once per "Z",

     L_2 length of (q R R R Z R Z R R Z R Z) <= (1/5)^t * (L_2 length of q),

  and the L_2 length of q is 1/sqrt(n) since q is uniform.  So the
  probability that these t samples are all bad is at most (1/5)^t.

  For (at least) half of all 42k samples to be bad, some set of t = 21k
  of them must be all bad.  There are at most 2^(42k) such sets, so the
  probability of failure is at most

     2^(42k) * (1/5)^(21k) = (4/5)^(21k) <= (1/100)^k.       QED
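[Aside: to make the whole procedure concrete, here is a short Python
sketch of the amplification scheme, using the Gabber-Galil neighborhood
map above as the implicit expander.  It is only illustrative: it treats
the five maps as out-edges on a single vertex set rather than as the
bipartite graph, uses a placeholder value for beta rather than
computing the eigenvalue bound, and assumes a black-box
bpp_alg(instance, node) that answers correctly for all but a 1/100
fraction of the nodes.]

import random

def gg_neighbors(x, y, m):
    # the five neighbors of (x, y) in the Gabber-Galil graph on Z_m x Z_m,
    # exactly as listed in the construction above (arithmetic mod m)
    return [((x + y + 1) % m, y), ((x + y) % m, y), (x, y),
            (x, (x + y) % m), (x, (x + y + 1) % m)]

def amplified(bpp_alg, instance, m, k, beta=10, seed=None):
    # A node (x, y) plays the role of the algorithm's r = 2*log(m) random
    # bits.  We spend r bits once, on the starting node, and then O(1)
    # bits per walk step (a choice among 5 neighbors): r + O(k) bits in
    # total over the 42k * beta steps, since beta is a constant.
    rng = random.Random(seed)
    x, y = rng.randrange(m), rng.randrange(m)     # random starting node
    votes = 0
    for _ in range(42 * k):
        for _ in range(beta):                     # walk beta steps between samples
            x, y = rng.choice(gg_neighbors(x, y, m))
        # sample: run the BPP algorithm with the current node as its coins
        votes += 1 if bpp_alg(instance, (x, y)) else -1
    return votes > 0                              # majority vote over the 42k samples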