CS 15-451 Analysis of Algorithms 9 November 2004
Manuel & Avrim Blum
Today: Definition and Theory of NP-completeness... third lecture.
DEFINITION: A decision problem PI (given by INSTANCE and QUESTION) is
NP-COMPLETE iff
1. PI is in NP.
2. PI is complete for all problems in NP. This means that for every
problem PI' in NP there is a poly-time function f from instances of
PI' to instances of PI that maps YES-instances of PI' to YES-instances
of PI and NO-instances of PI' to NO-instances of PI.
For example, suppose there exists a poly-time function f: 3SAT -> PI that maps
YES-instances to YES-instances, and NO-instances to NO-instances. Existence
of f implies that any poly-time algorithm for PI can be turned into a poly-time
algorithm for 3SAT: To solve an instance x of 3SAT, just map x into the
corresponding instance f(x) of PI. Then use the poly-time algorithm for PI
to get the answer for f(x). That answer to f(x) is the desired answer to x.
(Why?)
Such a function f is said to REDUCE 3SAT to PI. This is because f reduces
the problem of solving 3SAT to the problem of solving PI.
Cook's Theorem: 3SAT is NP-COMPLETE.
In what follows, we assume the truth of Cook's Theorem, that
3SAT is NP-complete. That makes it easier to prove that a problem
PI in NP is NP-complete. To do so, we only have to show that
1. PI is in NP, and
2. 3SAT reduces to PI.
Here is an example of how to do this (also due to Cook):
CLIQUE PROBLEM:
INSTANCE: a graph G = (V,E), and a positive integer k.
QUESTION: does G have a clique of size at least k?
A CLIQUE is a collection of vertices such that
every pair of vertices is joined by an edge.
THEOREM: The CLIQUE problem is NP-complete.
PROOF:
1. CLIQUE is in NP. (Why?)
2. 3SAT is poly-time reducible to CLIQUE:
An instance of 3SAT is defined by giving n variables and m clauses,
each clause containing 3 literals, each literal being either a variable or
its complement.
The function f must map each such instance of 3SAT into an instance of
CLIQUE: a pair (G, k), where G = (V,E) is an undirected graph and
k is a positive integer.
This function f must be answer-preserving: it must map YES-instances to
YES-instances and NO-instances to NO-instances. In addition, f must run in
polynomial time. This latter condition on running time is what prevents the
function f from solving the instance of 3SAT. No poly-time algorithm is
known for 3SAT. (It is possible that none exists.) So f must map an instance
of 3SAT for which it does not know the answer to an instance of CLIQUE for
which it does not know the answer -- yet still be answer-preserving.
An important idea for the construction of our particular f is to arrange
that the graph G in the image of f have all possible n(n-1)/2 edges
*except* for a few that we specifically leave out. A missing edge between
two vertices u,v ensures that at most one of these vertices can be in any
clique.
We define f as follows: f maps the n 3SAT variables X1, ..., Xn to n
pairs of *nonadjacent* vertices (x1, x1'), ..., (xn, xn'). f maps each
clause such as {X, Y, Z'} to a triple of *nonadjacent* vertices (x, y, z').
A vertex labelled x, say, and another labelled x', its complement, must be
*nonadjacent*. All edges that have not been specifically forbidden are
inserted into the graph.
This construction ensures that a clique can have no more than n+m
vertices, one from each of the n pairs (x1, x1'), ..., (xn, xn'). and one
from each of the m triples, such as (x, y, z'). In fact, if the clauses
can be satisfied, then such a clique of size m+n can be constructed, and
not otherwise, so set k = m+n.
To complete the proof, we must show 3 things:
1. f maps YES-instances to YES-instances.
2. f maps NO-instances to NO-instances.
3. f is poly-time.
You should be able to do this yourself!
QED
Here is another example, this one due to Karp:
VERTEX COVER PROBLEM:
INSTANCE: an undirected graph G, and a positive integer k.
QUESTION: Does G have a VC (a set of vertices "covering" all the edges)
that has at most k vertices?
THEOREM: The VERTEX COVER problem is NP-complete.
We now have two ways to go about proving this theorem: we can either reduce
3SAT directly to VC, or we can reduce CLIQUE to VC: Suppose we construct a
poly-time function g that reduces CLIQUE to VC, just as the above poly-time
f reduced 3SAT to CLIQUE. Then the composition of g and f, gf, will be a
poly-time reduction of 3SAT to VC:
gf(3SAT instance) = g(CLIQUE instance) = VC instance
NP-completeness is about EXPRESSIVENESS. A reduction of 3SAT to PI is a
proof that PI is as expressive as 3SAT -- as capable as 3SAT at describing
computational problems. But how expressive is 3SAT? There is another
problem that is at least as expressive as 3SAT, and that is:
BOOLEAN FORMULA SATISFIABILITY
INSTANCE: a boolean formula (such as
( ((X -> Y) or (Y -> Z)) and X'Z' )
QUESTION: is the boolean formula satisfiable? In other words, is there a
truth assignment to the variables of the boolean formula that make the
formula true?
In the above example, the assignment of T to each of X,Y,Z is not
satisfying. However, there are many satisfying assignments including any
assignment that maps X -> F.
BOOLEAN FORMULA SATISFIABILITY is clearly NP-Complete.
(Why? Because it is in NP (Why?), and because an instance of 3SAT maps
directly to an instance of BOOLEAN FORMULA SATISFIABILITY as in the
following example:
{X, Y, Z'} {X', Y, Z} -> (X + Y + Z')(X' + Y + Z))