Proof Complexity

We have already commented briefly on the fact that proof systems can be evaluated based on provable bounds on the proofs of certain classes of formulae, or by the development of polynomial transformations from proofs in one system into proofs in another.

With regard to the first metric, there are at least three classes of
problems known to be exponentially difficult for conventional
resolution-based provers (including any DPLL implementation):^{10}

- Pigeonhole problems [HakenHaken1985]
- Parity problems [TseitinTseitin1970]
- Clique coloring problems [Bonet, Pitassi, RazBonet et al.1997,Kraj'icekKraj'icek1997,PudlakPudlak1997]

Before turning to a discussion of these problems specifically,
however, let us point out that there are many proof systems that are
known to be more powerful than any of the ones we discuss in this
paper. From our perspective, the most interesting is *extended
resolution* and involves the introduction of new variables that
correspond to arbitrary logical expressions built up out of the
original variables in the theory.

Since such logical expressions can always be built up term-by-term,
it suffices to allow the introduction of new variables corresponding
to pairwise combinations of existing ones; since disjunction can be
replaced with conjunction using de Morgan's laws, it suffices to
introduce new variables of the form

where and are literals in the (possibly already extended) theory and is a new variable. Following this, derivation proceeds using conventional resolution on the augmented theory.

There is no proof system known to be stronger than extended resolution; in fact, there is no class of problems for which there are known to be no polynomially sized proofs in extended resolution.

As we have stressed, however, the fact that a proof system is strong
does not mean that it works well in practice. We know of *no*
implementation of extended resolution for the simple reason that
virtually nothing is known about how to select new variables so as to
shorten proof length.

Understanding *why* the introduction of these new variables can
reduce proof length is considerably simpler. As an example, suppose
that during a resolution proof, we have managed to derive the nogood
, and that we have also derived . In order to
complete the proof, we need to perform lengthy - but identical -
analyses of each of these nogoods, eventually deriving simply from
the first and from the second (and then resolving against
, for example).

If we could replace the pair of nogoods and with the single nogood using (8), the two proofs from and could be collapsed into a single proof, potentially halving the size of the proof in its entirety. Introducing still more variables can repeat the effect, resulting in exponential reductions in proof size.

Another way to look at this is as an improvement in expressivity. There is simply no way to write or the equivalent as a single Boolean axiom in disjunctive normal form. The power of extended resolution rests on the fact that subexpression substitution makes it possible to capture expressions such as in a single axiom.

None of the proof systems being considered in this survey is as powerful as extended resolution, and we will therefore evaluate them based on their performance on the three problems mentioned at the beginning of this section. Let us therefore describe each of those problems in some detail.