15-854 Approximation and Online Algorithms                         02/09/00

* Tidying up some loose ends from last time.
* The MAX-CUT problem and semidefinite programming.

[Presentations: I would like everyone to pick something by 1 wk from today.]

[Clarification on problem 5: all I'm looking for is a linear-time algorithm,
 given that we've already mapped the real numbers to rankings. Getting the
 rankings from the real numbers can be done in expected linear time with a
 2-level bucket sort.]

=============================================================================

Last time we went through a 3-approximation to the shortest superstring
problem. I just want to give an example to clear up one of the issues that
got confusing in the discussion.

Remember the algorithm: we construct the prefix graph and then find the
optimal cycle cover (whose weight is <= opt TSP tour <= opt superstring).
For a 4-approximation, we then open up each cycle arbitrarily and
concatenate. For a 3-approximation, replace "concatenate" with "merge using
the greedy algorithm".

Notice that in the optimal cycle cover, we would never have a cycle like
this:

           a       ba      b
    1------>2------>3----->1

(think of what the strings would have to look like). On the other hand, we
could have a cycle like this:

          ba       ba      b
    1------>2------>3----->1

in which case string 3 might look like bbababbababbabab.

Now, suppose this string overlapped in 10 characters (bababbabab) with some
string in a different cycle. Then that other cycle couldn't have weight 5,
or else it would be the same cycle as this one. It also couldn't have
weight < 5, since that would force our cycle to have a subperiod (only
possible anyway for non-prime lengths, but the point is that then it
wouldn't be minimum). In fact, in this case the shortest possible weight
for the other cycle is 7: bababba.

This is basically the reasoning behind the key lemma we needed: in the
optimal cycle cover of the prefix graph, if string s1 is in cycle c1 and
string s2 is in cycle c2, then overlap(s1,s2) < weight(c1) + weight(c2).

==============================================================================

Now we turn to a new technique called semidefinite programming. Much as LP
relaxed {0,1} values to [0,1] values, this will relax numbers to vectors
(or points) in an n-dimensional space. Our optimization problems will end
up looking like various kinds of clustering problems, so it helps if you
can think in n-dimensional space....

Today we'll talk about it in the context of the MAX-CUT problem.

==============================================================================

MAX-CUT: Given a graph G, partition the vertices into two sets S and T so
as to maximize the number of edges between S and T.

(E.g., if the graph is 2-colorable, then a proper 2-coloring gives a
perfect cut from this point of view. If not, then this is like asking for
the 2-coloring that gets the most edges correct, a lot like MAX 2-SAT. In
fact, the techniques will carry over to that too....)

Here's a natural greedy algorithm:

  - Start with an arbitrary cut.
  - If some node has more neighbors on its own side than on the other
    side, move it to the other side. Repeat.

Can you prove this won't just run forever? It halts in O(m) steps: each
move increases the number of edges crossing the cut by at least 1, and
that number can never exceed m.

Claim: at the end, at least half of the edges are crossing the cut. Why?
At termination, every node has at least half of its incident edges crossing
the cut (otherwise we would have moved it); summing over all nodes counts
each crossing edge twice, so at least m/2 edges cross. So this is trivially
a 1/2-approximation.

How about a really simple randomized algorithm? Just put each node on a
random side. Every edge has probability 1/2 of crossing the cut, so the
expected number crossing the cut is m/2.

Basically, this was the best known for a long time.....
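[Aside: here's a minimal Python sketch of both 1/2-approximation baselines,
 assuming the graph is given as a list of undirected edges on vertices
 0..n-1 (that representation is just for illustration):

    import random

    def cut_size(edges, side):
        # Number of edges with endpoints on opposite sides of the cut.
        return sum(1 for u, v in edges if side[u] != side[v])

    def greedy_local_search(n, edges):
        # Start from an arbitrary cut; while some vertex has more
        # neighbors on its own side than across, move it. Each move
        # raises the cut value by >= 1, so there are at most m moves.
        side = [0] * n
        nbrs = [[] for _ in range(n)]
        for u, v in edges:
            nbrs[u].append(v)
            nbrs[v].append(u)
        improved = True
        while improved:
            improved = False
            for u in range(n):
                same = sum(1 for w in nbrs[u] if side[w] == side[u])
                if 2 * same > len(nbrs[u]):  # more neighbors on u's side
                    side[u] ^= 1             # move u to the other side
                    improved = True
        return side

    def random_cut(n):
        # Each edge crosses with probability 1/2, so E[cut] = m/2.
        return [random.randint(0, 1) for _ in range(n)]
]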
Then [Goemans & Williamson] showed how one could use semidefinite
programming to do a lot better.

=============================================================================

What is semidefinite programming? We'll start with an operational
definition (what you can do) and then look at what's under the hood.

Operational definition: semidefinite programming is like linear
programming, but your variables are vectors, and what you're allowed to
write down as constraints are linear inequalities on DOT PRODUCTS of these
vectors. (You can also maximize or minimize an objective function of this
form.)

E.g., vectors a, b, c. Constraints: a.a = 1, b.b = 1, c.c = 1,
a.b <= 0, b.c <= 0, a.c <= 0. What if we wanted to maximize
a.b + b.c + c.a? What if we wanted to minimize it?

Notice: we're not allowed to specify that these vectors must live in a
2-dimensional space. So, in general, their span could have dimension as
high as the number of vectors.

Let's try to use this for MAX-CUT. We'll have one variable (vector) for
each node in the graph. Let's require them to be unit vectors by saying
vi.vi = 1 for all i. Now, we want to put them into two clusters to
maximize the number of edges between the clusters. Here's one way we can
try to do that:

    maximize    SUM      0.5*(1 - u.v)
             (u,v) in E

E.g., if u = v then the edge contributes 0 to the sum; if u = -v then it
contributes 0.5*2 = 1; and if u is perpendicular to v then it contributes
1/2.

In particular, notice that if we could magically add the constraint "all
vectors must lie in a 1-dimensional space" (since they have length 1, this
is equivalent to saying each is at +1 or -1), then our objective function
would be EXACTLY EQUAL to the number of edges crossing the cut.
Unfortunately, we can't. So, much as an LP relaxes {0,1} to fractional
values, we are relaxing by allowing an n-dimensional space. The SDP might
therefore return a "better than optimal" solution according to its
objective, by using this freedom. E.g., if the graph is a triangle, then
the max cut has value 2. What would the SDP return? (An equilateral
triangle of unit vectors at 120-degree angles: each edge contributes
0.5*(1 - cos 120) = 3/4, for a total of 9/4.)

The difficulty with SDPs is that we then have to somehow "round" these
vectors back to boolean values. For the MAX-CUT problem, here's what we'll
do:

  - Pick a random hyperplane through the origin.
  - Let S = the set of points on one side, and T = the set on the other.

Claim: this gives a 0.878-approximation.

==========================================================================

MAX-CUT Algorithm:

  - Set up and solve the SDP.
  - Split into S and T with a random hyperplane through the origin.
    (A code sketch of this rounding step appears at the end of these
    notes.)

Two things to do now: (1) see how this SDP box really works, and (2) prove
the claim that this gives a 0.878-approximation. Let's do (2); if there's
time left, we'll get back to (1) at the end.

Proof of claim: First of all, given two vectors u and v separated by an
angle alpha, what is the probability that they get split by a random
hyperplane? Answer: alpha/pi. Why? The important point is that the
intersection of a random hyperplane with the 2-dimensional plane defined
by u and v looks like a random line through the origin (with probability
1), and such a line separates u from v exactly when it falls within the
angle alpha between them (on either side), which happens with probability
2*alpha/(2*pi) = alpha/pi.

So we can calculate the expected value of our solution as a function of
all the pairwise angles:

    E[size of cut] =    SUM      angle(u,v)/pi
                     (u,v) in E

Compare this to

        SUM      0.5*(1 - u.v)  =  OPT^*  >=  OPT,
     (u,v) in E

where OPT^* is the optimal value of the SDP relaxation (which can only be
better than the true max cut). So all we need to do is compare the two
sums term by term. If the angle between u and v is alpha, then
u.v = cos(alpha). Draw the graph of alpha/pi for alpha in [0, pi] and
compare it to 0.5*(1 - cos(alpha)).
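[Aside: instead of drawing the picture, here's a quick numeric version of
 the same comparison: tabulate the ratio of the two curves over (0, pi]
 and take the minimum.

    import math

    def ratio(alpha):
        # (probability the edge is cut) / (its SDP contribution)
        return (alpha / math.pi) / (0.5 * (1 - math.cos(alpha)))

    alphas = [k * math.pi / 100000 for k in range(1, 100001)]
    worst = min(alphas, key=ratio)
    print(worst, ratio(worst))
    # minimum ratio ~ 0.87856, attained near alpha ~ 2.33 radians
]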
What you get is that for any angle alpha in [0, pi],

    alpha/pi  >=  0.878 * (1 - cos(alpha))/2.

So, comparing edge by edge, the expected size of our cut is at least
0.878 * OPT^* >= 0.878 * OPT.
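[Aside: a minimal sketch of the rounding step itself. Lacking an SDP
 solver here, it hard-codes the triangle example's embedding: three unit
 vectors at 120-degree angles, with SDP value 9/4.

    import math
    import random

    def round_by_hyperplane(vectors):
        # A random hyperplane through the origin is given by a random
        # normal vector r (Gaussian coordinates make its direction
        # uniform); v's side of the hyperplane is the sign of r.v.
        dim = len(vectors[0])
        r = [random.gauss(0, 1) for _ in range(dim)]
        dot = lambda x, y: sum(a * b for a, b in zip(x, y))
        return [0 if dot(r, v) >= 0 else 1 for v in vectors]

    # Triangle: vertices 0,1,2 with an edge between every pair.
    edges = [(0, 1), (1, 2), (2, 0)]
    vecs = [(math.cos(2 * math.pi * i / 3), math.sin(2 * math.pi * i / 3))
            for i in range(3)]

    trials = 100000
    total = 0
    for _ in range(trials):
        side = round_by_hyperplane(vecs)
        total += sum(1 for u, v in edges if side[u] != side[v])
    # Each edge spans angle 2*pi/3, so it is cut with probability 2/3
    # and the expected cut is 3 * (2/3) = 2 -- the true max cut, and
    # well above 0.878 * 9/4.
    print(total / trials)
]

===========================================================================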