15-451 11/09/00 * More NP-complete problems * Other complexity classes =========================================================================== In the discussion below, a "problem" means something like 3-coloring or network flow, and an "instance" means a specific instance of that problem --- the graph to color or the network to find a flow on. NP: problem Q is in NP if there is a verifier V(w,I) that runs in time poly(|I|) such that: If "I" is a YES-instance, then there exists w such that V(w,I) = 1. If "I" is a NO-instance, then for all w, V(w,I) = 0. E.g., 3-coloring. The instance is the graph and the witness is the coloring. How about TSP: "Given a graph G and a distance d, does there exist a tour that visits all nodes and has length <= d?" (aside: can anyone see how to use a black box that solves this to actually *find* the tour?) Note: can actually assume V is linear-time since can pad the instance with 0's. Problem Q is NP-complete if: (1) Q is in NP, and (2) Any other problem Q' in NP is polynomial-time reducible to Q. If Q just satisfies (2) then it's called NP-hard. In some sense, the canonical NP-complete problem is "given a verifier V (written in some standard programming language), and an instance I, and a polynomial time-bound B, does there exist w such that V(w,I) halts in time B and outputs 1?" (We can enforce that B is poly(|I|) by requiring it to be written in unary.) Theorem from last time: 3-SAT and Circuit-SAT are NP-complete. IMPORTANT FACT: now that we have 3-SAT, in order to prove some other NP problem Q is NP-complete, we just need to show 3-SAT <= Q: if we could solve Q then we could solve 3-SAT. MAKE SURE YOU UNDERSTAND THIS - A LOT OF PEOPLE MAKE THE MISTAKE OF DOING IT THE OTHER WAY AROUND. MAX-CLIQUE: given a graph G, find the largest clique (set of nodes s.t. all pairs in the set are neighbors). Decision problem: "Given G and integer k, is there a clique of size >= k?". Theorem: Max-Clique is NP-hard. Proof: reduce 3-SAT to MAX-CLIQUE. Given a 3-CNF formula F of m clauses over n variables, we construct a graph as follows. For each clause c of F we create one node for every assignment to variables in c that satisfies c. E.g., say F = (x1 OR x2 OR not(x4)) AND (not(x3) OR x4) AND (not(x2) OR not(x3)) AND ... Then in this case we would create nodes like this: (x1=0,x2=0,x4=0) (x3=0,x4=0) (x2=0,x3=0) ... (x1=0,x2=1,x4=0) (x3=0,x4=1) (x2=0,x3=1) (x1=0,x2=1,x4=1) (x3=1,x4=1) (x2=1,x3=0) ... Note that there are at most 7m nodes. Then we put an edge between two nodes if the partial assignments are consistent. Note: max possible clique size is m. And, if the 3-SaT problem does have a satisfying assignment, then in fact there IS an m-clique. Claim is this is true in the other direction too. If the graph has an m-clique, then there is a satisfying assignment: namely, just read off the assignment given in the nodes of the clique. So, Max-Clique is NP-complete too. Independent Set =============== An independent set in a graph is a set of nodes no two of which have an edge. E.g., in a 7-cycle, the largest independent set has size 3. (E.g, in the graph coloring problem, the set of nodes colored red is an independent set). Theorem: Independent set (is there an Indep Set in G of size > k?) is NP-complete. Proof: Reduce from clique. Given graph G for clique problem, just take complement of the graph . Ie. create graph H such that H has edge (u,v) iff G does NOT have edge (u,v). Then H has an indep set of size k iff G has a k-clique. Vertex-Cover ============ A vertex cover in a graph is a set of nodes such that every edge is incident to at least one. (e.g., look at cut-diamond graph). For instance, can think of as what rooms to put security guards in a museum. What we want is the smallest vertex cover. Decision problem: does there exist a vertex cover of size < k? Claim: if C is a vertex cover in a graph, then V-C is an independent set. Also if S is an independent set, then V-S is a vertex cover. So, to solve "is there an independent set of size > k?" just solve "is there a vertex cover of size < n-k?". So, Vertex cover is NP-Complete too. ------------------------------------------------------------------- More complexity classes: in all these, assume w is a string of poly(|I|) bits. NP: Q is in NP if there is a polynomial-time algorithm V(w,I) such that: If "I" is a YES-instance, then there exists w such that V(w,I) = 1. If "I" is a NO-instance, then for all w, V(w,I) = 0. Co-NP: other way around from NP: swap YES and NO. RP: randomized polynomial time If "I" is a YES-instance, then for at least half of w, V(w,I) = 1. If "I" is a NO-instance, then for all w, V(w,I) = 0. Co-RP: the other way around. E.g., will see later that primality is in Co-RP: there are efficient algs such that if N is prime they always output YES, and if N is composite then with prob >= 1/2 they output NO. Can run many times to amplify. (If ever see a NO, you know it's not prime. If always see YES, then probably it's prime) BPP: two-sided error If "I" is a YES-instance, then for at least 2/3 of w, V(w,I) = 1. If "I" is a NO-instance, then for at least 2/3 of w, V(w,I) = 0. What would a problem that's not even in NP look like? Here's a problem that's believed to not be in NP: For the n by n version of chess, does the first player have a winning strategy? (The answer is not known for the usual 8x8 chess). Might be true --- but even if you knew it was true, it's not clear how you could give a short proof.