[12pt,letterpaper]article

A counting class based on PSPACE that collapses to #P

0.5in

Ryan Williams, '01

We attempt to define a counting class that is slightly more difficult than #P, one that enumerates particular instances of a PSPACE set. However, we find that the new class we define is the same as #P, even though the new class appeared to have functions slightly ``less computable" than #P. Implications and further possibilities are discussed.

Background

In this section, we provide an informal introduction to the complexity classes P, NP, and PSPACE, emphasizing the complete sets CVP, SAT, and QBF.

The complexity classes P, NP, and PSPACE lie at the heart of our notions about which computational problems are feasible and intractable. These classes have arguably have been subject to more intensive research than any other concept in theoretical computer science. The reasons for this are evident when one realizes exactly what P, NP, and PSPACE are. P represents the class of languages (problems with variables) that can be solved by a polynomial time bounded Turing machine (computer program.) NP represents the class of languages whose solutions that can verified by a polynomial time Turing machine. PSPACE represents the class of languages that can be solved by a polynomial space bounded Turing machine.

It is not hard to see that P Í NP Í PSPACE . Intuition tells us that just because the solution(s) to a problem can be verified efficiently, we cannot assume that the solution itself can be found efficiently- take, for example, a complex jigsaw puzzle. Thus most computer scientists believe that P ¹ NP. Also, if one is given an unbounded amount of time yet a bound on space, it seems one should be able to solve more problems with those resources than if one were constrained to the same bound on time, yet unbounded space to work with, since time is required to use the unbounded space. So the overwhelming consensus is that P ¹ PSPACE .

There is also a well-defined notion of ``hardest" languages in P, NP, and PSPACE. Given one of those complexity classes, and a language L (corresponding to some problem P), L is polynomial time (many-one) hard for that class iff there exists a polynomial time function s such that for all other languages L¢ in the class, x Î L¢Û s(x) Î L. In other words, any other problem in the class is just a `special case' of the problem P. We say a language L is complete for a class iff L is hard for that class, and L is a member of that class. So complete problems for a class are the ``hardest" kind of problems in a class.

There are three complete problems we will focus on in this article; each problem is complete for one of P, NP, and PSPACE.

The P-complete problem is the circuit value problem (CVP): ``given a Boolean formula F and some assignment a of its variables, is F true under the assignment a?" The language CVP will be the set of Boolean formulas and assignments such that the above holds.

The corresponding NP-complete problem is satisfiability (SAT): ``given the same F, is there an assignment of variables such that F is true?" The language SAT will be the set of Boolean formulas that have this property.

The corresponding PSPACE-complete problem is quantified boolean formulas (QBF): ``given the same F, with its variables quantified in some manner, is the quantification true?" The task of quantifying the variables can be seen as assigning each variable v of F to one of the following two conditions:

(1) there exists a value of v such that F holds or

(2) for all values of v, F holds.

QBF will be the language containing all pairs of Boolean formulas and their valid quantifications (i.e. quantifications of their variables that make the formula true.)

Counting classes

In this section, we will forego the informal treatment used so far.

Counting classes, or enumeration classes, address the questions ``how difficult is it to count the number of solutions to a given problem?" The elements analyzed in counting classes are not languages; rather, they are functions that represent languages: given an instance, a function returns the number of solutions to that instance for a particular language in another class.

In the below, let S be some alphabet, and let P be some polynomial time predicate.

In the following, a NPTM refers to a nondeterministic polynomial time Turing machine, and a PSM refers to a polynomial space Turing machine. (Note that we do not need to specify the determinism/nondeterminism of the machine, since the two concepts are computationally equivalent for polynomial space.)

The original definition to the following is from [Valiant], yet the below is a latter definition that has become standard.

Definition. #P = {f:S^*® N | ($ NPTM N) f(x) = number of accepting computations of N(x) }

An equivalent formulation is that #P is the set of functions f with the following property: there exists a polynomial time predicate P(x,y) and constant k such that f(x) = number of y, |y| < O(|x|^k), such that P(x,y) = T. (It is easy to see the equivalence of these two formulations, when reminded of what it means for an NPTM N to accept.)

A prime example of a function in #P is the function that, given a Boolean formula F as input, returns the number of truth assignments that satisfy F. Call this function #SAT.

Furthermore, note carefully that for any decision problem (language) L with a k such that for all x Î L and all s Î S_L(x) (the search problem associated with L), |s|£ O(|x|^k) and one can determine that s Î S_L(x) in polynomial time, then the function f(x) that returns the number of solutions to x Î L is a member of #P. So #P contains all ``efficiently computable" counting functions, and by the above result for #SAT, presumably very hard functions to compute.

Note that a function in #P essentially counts the number of witnesses to an instance of an NP-complete problem. There is a notion of completeness for #Pas well: a function f: S^*® N is #P-complete if f Î #P and there exist polynomial time functions p₁,p₂ such that for all g Î #P, g(x) = n Ûf(p₁(x)) = p₂(n). For obvious reasons, we will refer to this kind of reduction as a ``#P reduction," and say g £_#p f. Note it is known that #SAT is complete for #P with respect to #P reductions [Simon].

A natural question one may ask is: after #P, what is the next ``hardest" counting class? We'd like a class that ``counts" something, where each count takes (presumably) slightly more than the resources of NP.We introduce the following candidate for such a class, #PSPACE.

Definition. Let x be an instance of some language in NP. A quantification of the n variables v_i corresponding to x will be regarded as some element of the set {Q₁v₁Q₂v₂¼Q_nv_n| Q_i Î {$,"} }. For simplicity of notation, in the future we will refer to some element of this set as Qx.

Definition. #PSPACE = {f:S^*® N | ($ PTM M) f(x) = number of Qx such that [Qx M(x) accepts] is valid}.

The motivation for calling this a class based on polynomial space is as follows: note that the problem of deciding if a particular Qx is such that [Qx M(x) accepts] is a problem in PSPACE, and in general, is a problem not known to be in any lower complexity class. So we say that in each step of counting we make, we must perform a PSPACE computation (and no less, if PSPACE is not equal to NP). We see a parallel in this definition and the definition for #P, which contains functions that count the number of accepting computation paths, and finding such a path (presumably) requires NP to accomplish each time.

The idea of completeness for #PSPACE will be identical to the definition for #P-completeness given above.

One function that we will see is a member of #PSPACE is the function that, given a Boolean formula F, returns the number of valid quantifications of F's variables. We will (for obvious reasons) call this function #QBF. In the next section we will find that #SAT = #QBF, plus a little more.

#P = #PSPACE

The title of this section says everything. Specifically, we will show that if a function f can enumerate the number of certificates for a particular problem's instances in NP, f also enumerates the number of valid quantifications of each instance's variables. We do this by showing that the #P-complete function #SAT is the same function as #QBF, which will then be shown to be #PSPACE-complete.

Theorem 1: #QBF is complete for #PSPACE, with respect to #P reductions.

Proof. #QBF is in #PSPACE: Let M' be a machine that decides CVP (taking the input (F,x), where F is a formula, x is an assignment), Qx F be an arbitrary QBF. Then: the number of Qx such that (Qx) [M'(F, x) accepts] = number of Qx such that [Qx F] is valid = #QBF. Hence #QBF is in #PSPACE, since M' is a PTM.

Let f be some function in #PSPACE. Then there exists a PTM P¢ such that f(x) = number of Qx such that (Qx) [P'(x) accepts] is true.

Let L' = {x | P'(x) accepts}. Note that L' is in P. Let s be the polynomial time reduction from L' to CVP, so x Î L¢ Û s(x) Î CVP, i.e. P'(x) accepts if and only if s(x) is a formula with a satisfying truth assignment of its variables.

Therefore f(x) = number of Qx such that (Qx) [P'(x) accepts] is valid

= number of Qs(x) such that [Qs(x) s(x)] is a valid QBF

= #QBF(s(x)).

In other words, f(x) = n Û #QBF(s(x)) = n. Setting p₁(x) = s(x), p₂(n) = n, we note that both are polynomial time functions; therefore f £_#p #QBF, and we are done.

Given this, the following fact is quite surprising. Roughly, it says that counting valid quantifications of the variables for a polynomial time predicate requires the same resources as counting the number of variable assignments that satisfy that predicate.

Theorem 2: #QBF = #SAT.

Proof. Let F be a Boolean formula and n be the number of variables. Proof is by induction on n.

Basis: n = 0. There are two formulas with no variables: the canonically true formula (T) and the false formula (^). #SAT(T) = 1 since the satisfying assignment is the ``empty" assignment; #QBF(T) = 1 since a valid quantification is the ``empty" quantification. #SAT(^) = 0 for obvious reasons, and #QBF(^) = 0.

Induction step. Assume #QBF = #SAT for all formulas F with n-1 variables. Let us analyze the output of the two functions for a formula F¢ with n variables- write this as F¢(x₁,¼,x_n-1,x_n). Let F¢_t and F¢_f be F¢ with the nth variable evaluated as true and false, respectively.

First we'll observe a few things about the number of valid quantifications of F¢.

(1) #SAT(F¢_t ÚF¢_f) = #SAT(F¢_t)+#SAT(F¢_f) - #SAT(F¢_t ÙF¢_f) . The proof of this is simply a translation to English: the number of solutions satisfying two Boolean formulas is the number of solutions satisfying each formula separately, minus the number of solutions that satisfy both formulas.

(2) Let Q^kx be some quantification of the first k variables of F¢. Then

Q^n-1x "x_n F¢ = TÛQ^n-1x (F¢_tÙF¢_f) = T and

Q^n-1x $x_n F¢ = TÛQ^n-1x (F¢_tÚF¢_f) = T.

Both of these follow from the properties of quantifiers, and the definitions of F¢_tand F¢_f.

Finally,

#QBF(F¢) = number of Qⁿx F¢ = T such that the nth variable is existential

+ number of Qⁿx F¢ = T such that the nth variable is universal.

So (2) implies #QBF(F¢) = #QBF(F¢_tÙF¢_f)+#QBF(F¢_tÚF¢_f)

= #SAT(F¢_tÙF¢_f) + #SAT(F¢_t ÚF¢_f) by the induction hypothesis (since F¢_tand F¢_f are formulas with n-1 variables).

= #SAT(F¢_t) +#SAT(F¢_f) by (1)

= #SAT(F¢).

Corollary: #P = #PSPACE.

Proof. Due to #SAT=#QBF, #SAT is #P-complete, and #QBF is #PSPACE-complete.

Discussion and Implications

In light of the results, we may regard PSPACE as a class of verification problems, in the same way that P is a class of verification problems (though instances of PSPACE complete problems are presumably much harder to verify.) Recall that the question defining QBF is: "given a Boolean formula F, and a quantification of its variables Qx, is (Qx F) true?" A non-deterministic version of this problem could be stated as: "given F, is there a quantification of its variables Qx such that Qx F is true?" Note that this problem is in NP, since the number of valid quantifications = number of satisfying assignments.

A remark is in order, which could be regarded as an aside. Someone could easily say: ``Of course the problem is in NP- if there exists any valid quantification of variables at all, then the quantification of all existential variables will be valid, which can be verified in polynomial time." To make a point, we may require the quantification of variables to be non-trivial (i.e. some variables are existential, some are universal), yet this modified problem would still be in NP- we need only provide three distinct satisfying assignments of the formula, which would show that there exists at least three (distinct) valid quantifications of the variables.

Furthermore, it is worthy to note that, upon thinking of PSPACE strictly as a class of verification problems, the class #PSPACE is to PSPACE as #P is to P: one class counts the number of solutions that, concatenated with a partial instance of the problem, satisfy some property; the other class determines if a particular candidate satisfies the property.

Stepping back from this mode of thought, an interesting relationship has occurred. The problems in P are such that their solutions are easily verifiable, yet proof that there exists some solution to the problem is considered a hard task. In contrast, for problems in PSPACE, their solutions are extremely hard to verify, yet checking that a solution exists is just as hard as it is for the P problems. In fact, if P is not equal to NP, and NP is not equal to PSPACE, then checking that a solution exists to an instance of a probelm in PSPACE is inherently easier than verifying a solution to a particular instance (i.e. checking that there is a satisfying assignment to a formula is easier than checking any old quantification of the variables of the formula is a valid quantification). And for the problems in P, the case is just the opposite: checking any old solution to the formula is much easier than checking there is a solution.

A more tangible insight can be made concerning the game theory. Suppose we have some two-player game on n variables (call them ``moves"), and there is a system of rules that determines the outcome (i.e. winner and loser) of the game. The moves have an ordering, so that there is an element of time in the game. The system is defined such that if player one plays himself in the game (so that the game is in a ``solitare mode"), one can verify that player one can defeat himself (who is acting as player two) in polynomial time. The above result says that the number of different possible moves player one can use to defeat himself is precisely equal to the number of different ways the moves of the game can be played, yet player one has a winning strategy (in this game, any move has the possibility of being taken by either player one or player two.)

For example, suppose we have a game of two moves such that player one wins if and only if the person who takes the first move chooses true (from the set {true, false}.) There are two sets of moves player one can use to defeat himself in this game, by playing (true, true) or (true, false) (in either case, player one chooses true on the first move.)

Similarly, player one has a winning strategy if and only if he has control of the first move, since if player two had the first move he/she could just choose false and automatically win. So the possible ways that the game can be played so that player one always wins is (player one moves, player two moves) or (player one moves, player one moves). Either way, player one has the first move, so he can always ensure victory.

The point is that we have found a nice relationship between two-player games and their ``solitare" versions: the number of winning strategies for player one is equal to the number of played games such that player one wins. This neat result follows directly from #P = #PSPACE. Obviously, the entire realm of this relationship has not been explored fully, else we would probably know if NP = PSPACE or not.

Conclusion

We attempted to find a counting class slightly more ``difficult" than #P, and found that it collapsed to #P. Further inquiry can be made about what sort of other ``natural" enumeration classes there are, and how they relate to the class #P. Also, due to the #QBF = #SAT lemma, there exists some 1 to 1 correspondence between satisfying variable assignments of a Boolean formula and the valid variable quantifications of that formula. Extended research into the complexity of this interesting correspondence could yield results concerning the questions P = ? PSPACE and NP = ? PSPACE.

References

Cook. ``The complexity of theorem-proving procedures." Proc. 3rd Annual ACM STOC, Association for Computing Machinery, NY, p151-158, 1971.

Garey and Johnson. Computers and Intractability. W.H. Freeman and Company, 1979.

Simon. ``On Some Central Problems in Computational Complexity", Doctoral Thesis, Dept. of Computer Science, Cornell University, Ithaca, NY, 1975.

Valiant. ``The complexity of computing the permanent," Report No. CSR-14-77, Computer Science Dept, University of Edinburgh, Edinburgh, Scotland, 1977.

File translated from T_EX by T_TH, version 2.00.
On 30 Mar 1999, 19:58.