15-451 Algorithms 11/11/09
recitation notes
* NP-completeness reductions contd.
* Approximation algorithm for MAX-EXACTLY-3-SAT
====================================================================
More NP-completeness reductions
===============================
The practice quiz talked about SET-SPLITTING. Given a collection of
sets over n points {1,2,...,n} we want to color the points red or blue
so that each set gets at least one point of each color. This turns
out to be NP-complete. Let's prove it by reduction from 3-SAT.
Given a 3-SAT formula F, create a SET-SPLITTING instance as follows:
- one point for each variable x_i and one point for each "not(x_i)".
- for each i, make the set {x_i, not(x_i)}. This will force them to
be different colors. (One color will represent "true" and the
other will represent "false".
- one additional point "f" whose color we will interpret as "false".
- then, for each clause c in the given 3-SAT formula, we create a set
of size 4 containing the literals in c plus the point f.
Claim 1: if the 3-SAT formula was satisfiable then the set-splitting
instance is solvable too. Proof: use the two colors "true" and
"false". Color each literal according to the setting of some
satisfying assignment, and color "f" false. Each set has at least one
true and at least one false (namely f) so it's a yes-instance too.
Claim 2: vice-versa. Just interpret the color of f as false and the
other color as true and read off the colors as the assignment. Each
clause must have at least one literal set to true, and we never set
literals inconsistently because of the sets {x_i, not(x_i)}
Now turning to approximation algorithms...
MAX-EXACTLY-3-SAT
=================
Suppose we have a 3-SAT formula where every clause has size exactly 3
with no duplicate variables per clause allowed (e.g., can't have (x_1
v x_1 v x_1) or (x_1 v not(x_1))). Then there is a simple randomized
algorithm that can satisfy at least a 7/8 fraction of clauses in
expectation. Can anyone see what it is?
[just try a random assignment to the variables]
How about a deterministic algorithm? Here's a nice way we can do that.
First, let's generalize the above statement to talk about general CNF
formulas.
Claim: Suppose we have a CNF formula of m clauses, with m_1 clauses of
size 1, m_2 of size 2, etc. (m = m_1 + m_2 + ...). Again, no
duplicate variables allowed in the same clause. Then a random
assignment satisfies sum_k m_k(1 - 1/2^k) clauses.
Proof: Can anyone see a proof? [linearity of expectation.]
Now, here is a deterministic algorithm : Look at x_1: for each of the
two possible settings (0 or 1) we can calculate the expected number of
clauses satisfied if we were to go with that setting, and then set the
rest of the variables randomly. (It is just the number of clauses
already satisfied plus sum_k m_k(1-1/2^k), where m_k is the number of
clauses of size k in the ``formula to go''.) Fix x_1 to the setting
that gives us a larger expectation. Now go on to x_2 and do the same
thing, setting it to the value with the highest expectation-to-go, and
then x_3 and so on. The point is that since we always pick the
setting whose expectation-to-go is larger, this expectation-to-go
never decreases (since our current expectation is the average of the
ones we get by setting the next variable to 0 or 1).
This is called the ``conditional expectation'' method. The algorithm
itself is completely deterministic --- in fact we could rewrite
it to get rid of any hint of randomization by just viewing sum_k
m_k(1-1/2^k) as a way of weighting the clauses to favor the small
ones, but our motivation was based on the randomized method.
Interesting fact: getting a 7/8 + epsilon approximation for any
constant epsilon (like .001) for MAX-exactly-3-SAT is NP-hard!