Resistive networks:

 THEOREM: Take a graph G and think of each edge as a unit resistor. 
 Let R_uv = effective resistance between u and v.  Then: 

	 Commute time C_uv = 2m R_uv.

 I.e., put in 2m units of current at u, pull out at v.  Then: C_uv =
 voltage drop.  


 Lemma 1: Say we inject d(x) units of current into each
 node x not equal to v, and remove 2m - d(v) units from v. [Note: is
 this legal?  Batteries set specific voltage, not specific current.
 But, can imagine slowly increasing voltages at each x, until get
 desired current. OK since negative feedback: increase voltage at x,
 only decreases current from y]  Then,

		Hitting time H_xv = V_x - V_v.


 Lemma 2: Say we inject 2m-d(u) amps into u and remove d(x) from each x.
 Then:
		Hitting time H_xu = V_u - V_x.

 Lemma 2 follows from Lemma 1 by negation.

 THEOREM follows from Lemma 1 and Lemma 2:  if we add the two
experiments together (the two sets of voltages), then currents add
too and w eget the experiment of putting 2m units into u and pulling
them out at v.  Voltage drop = (V_u - V_v) in expt1 + (V_u - V_v) in
expt2, which is H_uv + H_vu = C_uv.


Proof of Lemma 1: Clearly true for x=v.  Now lets consider x != v.
	Let's define voltage at v to be 0.

	Let's look at neighbors of x.  Current out of x = d(x).  Also,
	for each neighbor w, current on edge x->w equals (V_x - V_w)/1.
	

	So:    d(x)  =   SUM     [V_x - V_w]   	(N(x) = neighbors of x)
		       w in N(x)

	We can rewrite this as:

		V_x = 1  +   AVG   V_w
                          w in N(x)


	Now, let's look at H_xv.  We can write H_xv in terms of what
	happens after one step, like this:

	H_xv = 1 +  AVG    H_wv
                  w in N(x)


	So, {H_xv} and {V_x} are solutions to the same set of equations.  Now,
	this set of equations has a single solution once we fix V_v = 0 (or
	H_vv = 0).  Can see this by noticing that if you increase the value at
	one node, then either some neighbor must increase by even more, or
	else all neighbors must increase by exactly this amount.

==============================================================================
Random walk on graph is special case of random walk on Markov Chain.
Markov Chains:

 - defn.
	 P_ij = prob of going to j given that you're in state i.

 - e.g., for a graph, if put 1/2 prob of staying put on each step, then
   P_ij = 1/2d(i) if j is a neighbor of i, P_ii= 1/2, rest are 0.

 write a prob distribution as a row vector q.  Then, one step of walk = qP

 If underlying graph (include only edges with non-zero prob) is
 strongly connected, then chain is "irreducible".

 for finite irreducible MC's, "aperiodic" -> for any starting
 distribution, after finitely many steps, have non-zero prob on every state.

 stationary distribution: eigenvector of eigenvalue 1.

 Note: this is largest eigenvalue.  

 One of things we'll want for some algorithms is to show that a random
 walk is rapidly mixing -> E.g., approach stationary in only polylog(n) steps.
 What kinds of graphs / markov-chains give us this property?

=============================================================================

When do we get rapid mixing?

Ans #1: if have gap between largest and 2nd largest eigenvalue.

Ans #2: if graph is expander: for all sets S of <= n/2 nodes, 
		|N(S) - S| > epsilon*|S|.
        I.e., graph doesn't look like a dumbbell.

Next time: nice result of Noga Alon -> equivalence of these two concepts.
Also next time: how to consruct constant degree graphs with this property.
	Punchline: easiest way is just to put in edges randomly, but
	there are also deterministic constructions.  We'll also look
	at some other applications.

Today: keeping with theme of random walks: why eigenvalue gap gives
you rapid mixing, and one algorithmic application of this fact.

Eigenvalue gap -> Rapid Mixing
==============================

Theorem: Say M is a markov chain with real eigenvalues and orthogonal
eigenvectors.  Then, for any starting distribution, the L_2 distance
between q^(t) (the distribution after t steps) and the stationary
distribution $\pi$ is at most |\lambda_2|^t, where \lambda_2 is the
eigenvalue of second-largest absolute value.

So, if we have an eigenvalue gap of some constant $\epsilon$, then
takes only $O((\log n)/\epsilon)$ to get down to $1/n^c$.  

For instance, symmetric matrices have real eigenvalues and orthogonal
eigenvectors. E.g., say $M = (I/2 + A/2d)$ where $A$ is matrix for a
$d$-regular graph. Then $M$ is symmetric, all eigenvalues of $M$ are
nonnegative, and $M$ has eigenvalue gap $\epsilon/2d$ where $\epsilon$
is the eigenvalue gap for $A$.  

In fact, can generalize to ``time-reversible'' Markov chains:
$p_{ij}\pi_i = p_{ji}\pi_j$ for all $i,j$. E.g., random walk on
a graph where nodes are not necessarily all the same degree.


PROOF OF THEOREM: 
Say orthogonal eigenvectors are $e_1, \ldots, e_n$.  $e_1 = \pi$.
Say we start at

	$$q^{(0)} = c_1e_1 + ... + c_ne_n.$$

(Actually, $c_1=1$ since entries in $\pi$ sum to 1 and entries in all
other eigenvectors sum to zero.)
After $t$ steps, we're at:  

    $$q^{(t)} = c_1e_1 + (c_2(\lambda_2)^te_2 +... + c_n (\lambda_n)^te_n).$$

The $L_2$ length of this second part is at most $|\lambda_2|^t$ times
the length of $q^{(0)}$ (and $||q^{(0)}|| \leq 1$).  QED

(which is another way of seeing that $c_1=1$).