15-451 Algorithms 03/03/07
NP-completeness cont
* Recap, Formal definitions.
* reducing CIRCUIT-SAT to 3-SAT [proving 3-SAT is NP-complete]
* reducing 3-SAT to CLIQUE [proving CLIQUE is NP-complete]
* Independent Set, Vertex Cover
=======================================================================
In the discussion below, a "problem" means something like 3-coloring
or network flow, and an "instance" means a specific instance of that
problem --- the graph to color or the network to find a flow on.
A DECISION PROBLEM is just a problem where each instance is either a
YES-instance or a NO-instance and the goal is to decide which type
your instance is. (e.g., for 3-coloring, G is a YES-instance if it has a
3-coloring and NO if not. For the perfect-matching problem, G is a
YES-instance if it has one and a NO-instance if it does not.)
P: class of decision problems Q that have polynomial-time algorithms:
A(I)=YES iff I is a YES-instance of Q.
NP: decision problems where at least the YES-instances have short proofs
(that can be checked in polynomial-time) that the answer is YES. Q is
in NP if there is a verifier V(I,X) such that:
If "I" is a YES-instance, then there exists X such that V(I,X) = YES.
If "I" is a NO-instance, then for all X, V(I,X) = NO.
and furthermore the length of X and the running time of V are poly in |I|.
co-NP: vice-versa: there are short proofs for NO-instances. (E.g.,
given two circuits, C_1, C_2, do they compute the same function?).
Problem Q is NP-complete if:
(1) Q is in NP, and
(2) Any other Q' in NP is "polynomial-time reducible" to Q. In
particular, for any Q' in NP, there is some function f from instances
of Q' to instances of Q, that maps YES-instances of Q' to YES-instances
of Q, and maps NO-instances of Q' to NO-instances of Q.
If Q just satisfies (2) then it's called NP-hard.
Last time we showed that the following problem is NP-complete:
CIRCUIT-SAT: Given a circuit of NAND gates with a single output and no
loops (some of the inputs may be hardwired). Question: is there a
setting of the inputs that causes the circuit to output 1?
Proof sketch: It's clearly in NP, since V(I,X) just runs circuit I on
proposed solution X and checks that the circuit outputs 1. Now, say
Q' is some other problem in NP. Q' must have some polynomial-time
verifier V'. Now, given some instance I of Q', what function f does
is construct a circuit C_I such that C_I(X) = V'(I,X). So, f is like
a compiler that compiles V' into a circuit of NAND gates. Claim is
that we can do this by unrolling a polynomial number of loops of the
kind of NAND-gate computer you built in 15-251. Finally, by design,
C_I is a YES-instance of CIRCUIT-SAT iff I was a YES-instance of Q'.
So, if we can solve CIRCUIT-SAT then we can solve Q'.
Aside: we could define the SEARCH-version of a problem in NP as:
"..and furthermore, if I is a YES-instance, then *produce* X such that
V(I,X)=YES". If we can solve any NP-complete decision problem in
polynomial time then we can actually solve search-version of any
problem in NP in polynomial-time too. Will talk about in recitation.
Now CIRCUIT-SAT is a little unweildy. What's REALLY INTERESTING about
NP-completeness is not just that such problems exist, but that a lot
of very innocuous-looking problems are NP-complete. To show results
like that, we will first reduce CIRCUIT-SAT to a much simpler-looking
problem called 3-SAT.
3-SAT:
Given: a CNF formula (AND of ORs) over n variables x1,...,xn, where
each clause has at most 3 variables in it.
(x1 OR x2 OR not(x3)) AND (not(x2) OR x3) AND (x1 OR x3) AND ...
Goal: find an assignment to the variables that satisfies the formula
if one exists.
THEOREM: 3-SAT is NP-Complete
Proof: We need to define a function f that converts instances C of
CIRCUIT-SAT to instances of 3-SAT such that the formula produced is
satisfiable iff the circuit C had an input x such that C(x)=1.
First of all, let's assume our input is given as a list of gates,
where for each gate we are told what its inputs are connected to.
E.g., g1 = x1 NAND x3; g2 = g1 NAND x4; g3 = x1 NAND 1; g4 = g1 NAND g2; ...
plus we are told which gate gm is the output of the circuit.
We will now compile this into an instance of 3-SAT as follows. We'll
make one variable for each input xi of the circuit, and one for every
gate gi. We now write each NAND as a conjunction of 4 clauses. In
particular, we just replace the statement "y3 = y1 NAND y2" with:
(y1 OR y2 OR y3) // if y1=0 and y2=0 then we must have y3=1
AND (y1 OR not(y2) OR y3) // if y1=0 and y2=1 then we must have y3=1
AND (not(y1) OR y2 OR y3) // if y1=1 and y2=0 then we must have y3=1
AND (not(y1) OR not(y2) OR not(y3)) //if y1=1,y2=1 we must have y3=0
Finally, we add the clause (gm). This forces the circuit to output
1. In other words, we are asking: is there an input and a setting of
all the gates such that the output of the circuit is equal to 1, and
each gate is doing what it's supposed to? So, the 3-CNF formula
produced is satisfiable if and only if the circuit has a setting of
inputs that causes it to output 1. The size of the formula is linear
in the size of the circuit. The construction can be done in
polynomial (actually, linear) time. So, if we had a polynomial-time
algorithm to solve 3-SAT, then we could solve circuit-SAT in poly time
too.
====================================================================
IMPORTANT FACT: now that we have 3-SAT, in order to prove some other
NP problem Q is NP-complete, we just need to show 3-SAT <= Q: if we
could solve Q then we could solve 3-SAT. MAKE SURE YOU UNDERSTAND
THIS - A LOT OF PEOPLE MAKE THE MISTAKE OF DOING IT THE OTHER WAY
AROUND.
MAX-CLIQUE: given a graph G, find the largest clique (set of nodes
s.t. all pairs in the set are neighbors). Decision problem: "Given G
and integer k, is there a clique of size >= k?".
(MAX-CLIQUE is clearly in NP.)
Theorem: Max-Clique is NP-Complete.
Proof: reduce 3-SAT to MAX-CLIQUE.
Given a 3-CNF formula F of m clauses over n variables, we construct a
graph as follows. For each clause c of F we create one node for every
assignment to variables in c that satisfies c. E.g., say
F = (x1 OR x2 OR not(x4)) AND (not(x3) OR x4) AND (not(x2) OR not(x3)) AND ...
Then in this case we would create nodes like this:
(x1=0,x2=0,x4=0) (x3=0,x4=0) (x2=0,x3=0) ...
(x1=0,x2=1,x4=0) (x3=0,x4=1) (x2=0,x3=1)
(x1=0,x2=1,x4=1) (x3=1,x4=1) (x2=1,x3=0)
...
Then we put an edge between two nodes if the partial assignments are
consistent. Note: max possible clique size is m. And, if the 3-SAT
problem does have a satisfying assignment, then in fact there IS an
m-clique. Claim is this is true in the other direction too. If the
graph has an m-clique, then there is a satisfying assignment: namely,
just read off the assignment given in the nodes of the clique. So,
this graph has a clique of size m iff F was satisfiable. Also, our
reduction is poly time since the total size of graph is
at most quadratic in size of formula (O(m) nodes, O(m^2) edges).
Therefore Max-Clique is NP-complete.
Independent Set
===============
An independent set in a graph is a set of nodes no two of which have
an edge. E.g., in a 7-cycle, the largest independent set has size 3.
(E.g, in the graph coloring problem, the set of nodes colored red is
an independent set).
Theorem: Independent set (is there an Indep Set in G of size > k?) is
NP-complete.
Proof: Reduce from clique. Given graph G for clique problem, just
take complement of the graph . Ie. create graph H such that H has
edge (u,v) iff G does NOT have edge (u,v). Then H has an indep set of
size k iff G has a k-clique.
Vertex-Cover
============
A vertex cover in a graph is a set of nodes such that every edge is
incident to at least one. (e.g., look at cut-diamond graph). For
instance, can think of as what rooms to put security guards in a
museum. What we want is the smallest vertex cover.
Decision problem: does there exist a vertex cover of size < k?
Claim: if C is a vertex cover in a graph, then V-C is an independent
set. Also if S is an independent set, then V-S is a vertex cover.
So, to solve "is there an independent set of size > k?" just solve "is
there a vertex cover of size < n-k?". So, Vertex cover is NP-Complete
too.