# 8 Theorem Prover

Disclaimer: The theorem proving component of Twelf is in an even more experimental stage and currently under active development. There are two main restrictions which limit its utility: (1) it only support reasoning about closed objects, and (2) it cannot apply lemmas automatically.

Nonetheless, it can prove a number of interesting examples automatically which illustrate our approach the meta-theorem proving which is described in Schuermann and Pfenning 1998, CADE. These examples include type preservation for Mini-ML, one direction of compiler correctness for different abstract machines, soundness and completeness for logic programming interpreters, and the deduction theorem for Hilbert's formulation of propositional logic. These and other examples can be found in the example directories of the Twelf distribution (see section 13 Examples).

A theorem in Twelf is, properly speaking, a meta-theorem: it expresses a property of objects constructed over a fixed LF signature. Theorems are stated in the meta-logic M2 whose quantifiers range over LF objects. In the simplest case, we may just be asserting the existence of an LF object of a given type. This only requires direct search for a proof term, using methods inspired by logic programming. More generally, we may claim and prove forall/exists statements which allow us to express meta-theorems which require structural induction, such as type preservation under evaluation in a simple functional language (see section 5.6 Sample Program).

## 8.1 Theorem Declaration

There are two forms of declarations related to the proving of theorems and meta-theorems. The first, `%theorem`, states a theorem as a meta-formula (`mform`) in the meta-logic M2 defined below. The second, `%prove`, gives a resource bound, a theorem, and an induction ordering and asks Twelf to search for a proof.

Note that a well-typed `%theorem` declaration always succeeds, while the `%prove` declaration only succeeds if Twelf can find a proof.

```dec ::= {id:term}         % x:A
| {id}              % x

decs ::= dec
| dec decs

mform ::= forall* decs mform % implicit universal
| forall decs mform  % universal
| exists decs mform  % existential
| true               % truth

thdecl ::= id : mform        % theorem name a, spec

pdecl ::= nat order callpats % bound, induction order, theorems

decl ::= ...
| %theorem thdecl.  % theorem declaration
| %prove pdecl.     % prove declaration
```

The prover only accepts quantifier alternations of the form `forall* decs forall decs exists decs true`. Note that the implicit quantifiers (which will be suppressed when printing the proof terms) must all be collected in front.

The syntax and meaning of `order` and `callpats` are explained in section 7 Termination, since they are also critical notions in the simpler termination checker.

## 8.2 Sample Theorems

As a first example, we use the theorem prover to establish a simple theorem in first-order logic (namely that A implies A for any proposition A), using the signature for natural deduction (see section 3.6 Sample Signature).

```%theorem
trivI : exists {D:{A:o} nd (A imp A)}
true.

%prove 2 {} (trivI D).
```

The empty termination ordering `{}` instructs Twelf not to use induction to prove the theorem. The declarations above succeed, and with the default setting of `3` for `Twelf.chatter` we see

```%theorem trivI : ({A:o} nd (A imp A)) -> type.
%prove 2 {} (trivI D).
%mode -{D:{A:o} nd (A imp A)} trivI D.
% ------------------
/trivI/:  trivI ([A:o] impi ([D1:nd A] D1)).
% ------------------
```

The line starting with `%theorem` shows the way the theorem will be realized as a logic program predicate, the line starting with `/trivI/` gives the implementation, which, in this case, consists of just one line.

The second example is the type preservation theorem for evaluation in the lambda-calculus. This is a continuation of the example in Section section 5.6 Sample Program in the file `examples/guide/lam.elf'. Type preservation states that if and expression `E` has type `T` and `E` evaluates to `V`, the `V` also has type `T`. This is expressed as the following `%theorem` declaration.

```%theorem
tps : forall* {E:exp} {V:exp} {T:tp}
forall {D:eval E V} {P:of E T}
exists {Q:of V T}
true.
```

The proof proceeds by structural induction on `D`, the evaluation from `E` to `V`. Therefore we can search for the proof with the following declaration (where the size bound of `5` on proof term size is somewhat arbitrary).

```%prove 5 D (tps D P Q).
```

Twelf finds and displays the proof easily. The resulting program is installed in the global signature and can then be used to apply type preservation (see section 8.5 Proof Realizations).

## 8.3 Proof Steps

We expect the proof search component of Twelf to undergo major changes in the near future, so we only briefly review the current state.

Proving proceeds using three main kinds of steps:

Filling
Using iterative deepening, Twelf searches directly for objects to fill the existential quantifiers, given all the constants in the signature and the universally quantified variables in the theorem. The number of constructors in the answer substitution for each existential quantifier is bounded by the size which is given as part of the `%prove` declaration, thus guaranteeing termination (in principle).
Recursion
Based on the termination ordering, Twelf appeals to the induction hypothesis on smaller arguments. If there are several ways to use the induction hypothesis, Twelf non-deterministically picks one which has not yet been used. Since there may be infinitely many different ways to apply the induction hypothesis, the parameter `Twelf.Prover.maxRecurse` bounds the number of recursion steps in each case of the proof.
Splitting
Based on the types of the universally quantified variables, Twelf distinguishes all possible cases by considering all constructors in the signatures. It nevers splits a variable which appears as an index in an input argument, and if there are several possibilities it picks the one with fewest resulting cases. Splitting can go on indefinitely, so the paramater `Twelf.Prover.maxSplit` bounds the number of times a variable may be split.

## 8.4 Search Strategies

The basic proof steps of filling, recursion, and splitting are sequentialized in a simple strategy which never backtracks. First we attempt to fill all existential variables simultaneously. If that fails we recurse by trying to find new ways to appeal to the induction hypothesis. If this is not possible, we pick a variable to distinguish cases and then prove each subgoal in turn. If none of the steps are possible we fail.

This behavior can be changed with the parameter `Twelf.Prover.strategy` which defaults to `Twelf.Prover.FRS` (which means Filling-Recursion-Splitting). When set to `Twelf.Prover.RFS` Twelf will first try recursion, then filling, followed by splitting. This is often faster, but fails in some cases where the default strategy succeeds.

## 8.5 Proof Realizations

Proofs of meta-theorems are realized as logic programs. Such a logic program is a relational representation of the constructive proof and can be executed to generate witness terms for the existentials from given instances of the universal quantifiers. As an example, we consider once more type preservation (see section 8.2 Sample Theorems).

After the declarations,

```%theorem
tps : forall* {E:exp} {V:exp} {T:tp}
forall {D:eval E V} {P:of E T}
exists {Q:of V T}
true.

%prove 5 D (tps D P Q).
```

```/tps/tp_lam/ev_lam/:
tps ev_lam (tp_lam ([x:exp] [P2:of x T1] P1 x P2))
(tp_lam ([x:exp] [P3:of x T1] P1 x P3)).

/tps/tp_app/ev_app/tp_lam/:
tps (ev_app D1 D2 D3) (tp_app P1 P2) P6
<- tps D3 P2 (tp_lam ([x:exp] [P4:of x T2] P3 x P4))
<- tps D2 P1 P5
<- tps D1 (P3 E5 P5) P6.
```

which is the proof of type preservation expressed as a logic program with two clauses: one for evaluation of a lambda-abstraction, and one for application. Using the `%solve` declaration (see section 5.2 Solve Declaration) we can, for example, evaluate and type-check the identity applied to itself and then use type preservation to obtain a typing derivation for the resulting value.

```e0 = (app (lam [x] x) (lam [y] y)).
%solve p0 : of e0 T.
%solve d0 : eval e0 V.
%solve tps0 : tps d0 p0 Q.
```

Recall that `%solve c : V` executes the query `V` and defines the constant `c` to abbreviate the resulting proof term.