Notes for 02/05                              15-852 Randomized Algorithms
* random walks on graphs                     Chap 6.1, 6.3, 6.4, 6.5
============================================================================
Turn to a new topic: random walks on graphs.  Lots of applications:
 - approximation/estimation problems: estimating the permanent, volumes.
 - random selection: pick a random consistent linear separator.
 - derandomizing randomized algs.
Plus, it's neat by itself.  Say you're lost in a maze and don't have a good
memory.  How long will it take to get out if you just move around randomly?

Random walk on a graph: start at some initial vertex, pick a neighbor at
random, go there, and repeat.  Equivalently, think of the walk as currently
being on some edge, heading in some direction; then move to a random
neighboring edge in the forward direction.

Interesting quantities to look at:

H_uv = "hitting time" from u to v: E[# steps to first reach v | start at u].

C_uv = "commute time": E[# steps to go from u to v and then back to u].

  C_uv = H_uv + H_vu.  Why?  Linearity of expectation: let X = # steps to
  first hit v, and Y = # steps to hit u after first hitting v; then
  C_uv = E[X + Y] = E[X] + E[Y].

C_u = "cover time from u": the expected time to visit the entire graph
      when we start at u.

C_G = "cover time of G" = max_u C_u.

THEOREM: for all connected G, C_G <= 2m(n-1), where m = #edges and
n = #vertices.

We will prove this in two ways.

Implication: s-t connectivity is in "randomized log space".  With only
O(log n) bits of read/write memory, and required to halt in poly time, we
can determine connectivity with 1-sided error.  (Idea: run the walk from s
for, say, 4m(n-1) steps and accept if we ever see t; if s and t are
connected, then by Markov's inequality we reach t with probability >= 1/2.)
------------------------------------------------------------------------------
Method 1: [Aleliunas, Karp, Lipton, Lovasz, Rackoff]

Fix some arbitrary starting vertex for our walk, and look at the ratio:

   (# visits to u in the first t steps of the random walk) / t.

Claim: with probability 1, as t -> infinity, this approaches a limiting
value.  We'll call that value pi_u, and we can think of pi_u as the
"steady-state probability of being at u".  E.g., if the graph is  u-----v,
then we're at u half of the time.

Proof: Let X_i denote the time between the ith visit to u and the (i+1)st
visit to u.  E[X_i] = H_uu, which is finite; X_i also has finite variance.
This implies:
 (1) As t -> infinity, the number of visits to u goes to infinity too,
     with probability 1.
 (2) We can apply the law of large numbers: as s -> infinity,
     (X_1 + ... + X_s)/s --> H_uu with probability 1.
Now, notice that in our ratio the numerator is t and the denominator is the
number of visits to u in t steps.  So, pi_u = 1/H_uu.  In other words, what
you expect to happen happens:

   -----u-------u----u-----------u-----------u----u------->

   H_uu = average distance between marks
        = 1/(limiting frequency of being at u).

FACT: sum_u pi_u = 1.  (You're always somewhere.)

Lemma: pi_u/d(u) is the same for all nodes u, where d(u) is the degree of
u.  I.e., think of the walk as being on edges: then this lemma says that
all edges are equally frequent, no matter what the graph!

Proof: Let pi(v,t) = prob of being at v at time t.  Say u has neighbors
v1, ..., vd.  Then

   pi(u,t) = pi(v1,t-1)/d(v1) + pi(v2,t-1)/d(v2) + ... + pi(vd,t-1)/d(vd).

Since pi_v is the limiting value of the average of the pi(v,t), we can
remove the t's.  Also, dividing both sides by d(u):

   pi_u/d(u) = AVG[ pi_v1/d(v1), ..., pi_vd/d(vd) ]

==> everybody is the average of all their neighbors
==> all equal (consider working outward from the maximum)
==> all are equal to 1/(2m), i.e., pi_u = d(u)/2m.

Where are we now: the walk is at u a pi_u = d(u)/2m fraction of the time,
and on each such step it takes the edge (u,v) with probability 1/d(u), so
it traverses the directed edge u-->v a 1/(2m) fraction of the time.  Hence
the average time between successive traversals of any given directed edge
u-->v is 2m.
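A quick way to make this concrete is to simulate.  The sketch below (in
Python; the particular 4-node graph, step count, and seed are arbitrary
illustrative choices, not from lecture) estimates pi_u empirically,
compares it against d(u)/2m, and also checks H_uu = 1/pi_u = 2m/d(u):

    # Sketch: empirically check pi_u = d(u)/2m and H_uu = 2m/d(u) on a
    # small connected graph (a 4-cycle plus a chord; arbitrary example).
    import random

    adj = {0: [1, 3], 1: [0, 2, 3], 2: [1, 3], 3: [0, 1, 2]}
    m = sum(len(nbrs) for nbrs in adj.values()) // 2   # number of edges

    random.seed(0)
    T = 500_000
    visits = {u: 0 for u in adj}
    u = 0                                # arbitrary starting vertex
    for _ in range(T):
        visits[u] += 1
        u = random.choice(adj[u])        # step to a uniform random neighbor

    for v in sorted(adj):
        print(f"v={v}:  empirical pi_v = {visits[v]/T:.4f}   "
              f"d(v)/2m = {len(adj[v])/(2*m):.4f}   "
              f"T/visits = {T/visits[v]:.2f}   2m/d(v) = {2*m/len(adj[v]):.2f}")

(Here T/visits[v] is the observed average gap between visits to v, which
should approach H_vv = 1/pi_v.)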
Corollary: for any edge (u,v), C_uv <= 2m.

Proof: E[time to go to v and back to u | at u (and just came from v)]
<= E[time to next traverse (v,u) | on (v,u)] = 2m, since traversing (v,u)
again means we reached v and then stepped back to u.
---------------------------------------------------------------------
Leemon gave an interesting alternative proof of the above: imagine we begin
by picking a random edge and direction to start our random walk (i.e., we
start in the distribution specified in the Lemma).  Then notice that this
distribution is a fixed point: after 1 step, we remain with equal
probability on each edge in each direction.  So, over a sequence of t
steps, the expected number of times we traverse the edge u-->v is t/2m.
Now, we just want to invert this to say that the expected time between two
consecutive traversals of u-->v is 2m.  We can do this using the earlier
idea of setting a random variable X_i to denote the time between the ith
and (i+1)st traversals, and noting that this has finite expectation and
variance so long as the graph is connected.  (If the graph is disconnected,
then X_0 has infinite expectation.)  So, as t -> infinity, the number of
traversals of u-->v goes to infinity too, and we're ok.
---------------------------------------------------------------------
To finish the proof about the cover time of the graph: consider some
spanning tree of the graph; it has n-1 edges.  Consider some fixed tour on
the spanning tree --> it traverses each tree edge twice, once in each
direction.  So,

   E[time to visit the nodes in that order]
      = sum over all directed edges (u,v) in the tree of H_uv.

For each pair {(u,v), (v,u)} we get H_uv + H_vu = C_uv <= 2m.  Since the
tree has n-1 edges, the total is at most 2m(n-1).
===========================================================================
Proof 2.  Something different: resistors.

A resistive network is a graph with resistors on the edges.  E.g.,

   *-----2 Ohms------*
   |                 |
 1 Ohm            2 Ohms
   |                 |
   *-----3 Ohms------*

Kirchhoff's law: current is like water flow.  For any node, the sum of the
currents entering equals the sum leaving.

Ohm's law: V = IR, where V is the voltage difference across the resistor.
Voltage is a property of a node; it is sometimes called "potential".

Two resistors in series add resistance:

   *-------R1--------*-------R2-------*
   V0                V1               V2

Say I = current flowing from left to right.  Then V1 = V0 - I*R1 and
V2 = V0 - I*R1 - I*R2  ==>  delta(V) = I(R1 + R2).

Two resistors in parallel:

          _____R1______
         /             \
     V0 *               * V1
         \_____R2______/

   1/R = 1/R1 + 1/R2.

See this via: I1 = (V0 - V1)/R1, I2 = (V0 - V1)/R2, and I = I1 + I2.

Neat fact:

THEOREM: Take a graph G and think of each edge as a unit resistor.  Let
R_uv = effective resistance between u and v.  Then: C_uv = 2m R_uv.

We'll continue on Monday...
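In the meantime, here is a numerical sanity check of this theorem, as a
sketch (Python with numpy; the small example graph is an arbitrary choice,
the same one used earlier).  The effective resistances come from the
Laplacian pseudoinverse, R_uv = L+[u,u] + L+[v,v] - 2 L+[u,v] (a standard
formula, not from the notes), and the exact hitting times come from the
standard first-step recurrence H_uv = 1 + (1/d(u)) * sum_{w ~ u} H_wv,
solved as a linear system for each target v:

    # Sketch: verify C_uv = 2m * R_uv exactly on a small graph.
    import numpy as np

    edges = [(0, 1), (0, 3), (1, 2), (1, 3), (2, 3)]   # arbitrary example
    n, m = 4, len(edges)

    # Graph Laplacian L = D - A, treating each edge as a unit resistor.
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    Lpinv = np.linalg.pinv(L)            # Moore-Penrose pseudoinverse

    # Exact hitting times via first-step analysis: for a fixed target v,
    # H_vv = 0 and H_uv = 1 + (1/d(u)) * sum_{w ~ u} H_wv for u != v.
    deg = np.diag(L).copy()
    A = np.diag(deg) - L                 # adjacency matrix
    H = np.zeros((n, n))
    for v in range(n):
        idx = [u for u in range(n) if u != v]
        P = A[np.ix_(idx, idx)] / deg[idx][:, None]
        H[idx, v] = np.linalg.solve(np.eye(n - 1) - P, np.ones(n - 1))

    for u in range(n):
        for v in range(u + 1, n):
            R_uv = Lpinv[u, u] + Lpinv[v, v] - 2 * Lpinv[u, v]
            print(f"C({u},{v}) = {H[u, v] + H[v, u]:.4f}"
                  f"   2m*R = {2 * m * R_uv:.4f}")

E.g., for the pair (0,1) in this graph, both sides come out to 25/4.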