Twelf gives an operational interpretation to signatures under the computation-as-proof-search paradigm. The fundamental idea is to fix a simple search strategy and then search for a derivation of a query according to this strategy. The result may be a substitution for the free variables in a query and a derivation, or explicit failure. It is also possible that the computation does not terminate.
A query can be posed in three different ways: as a %query declaration, as a %solve declaration, or interactively, using a top-level invoked from ML with Twelf.top, which prompts with `?-' (see section 5.3 Interactive Queries).
query ::= id : term          % X : A, X a free variable
        | term               % A

bound ::= nat                % number of solutions
        | *                  % unbounded number

qdecl ::= bound bound query  % expected solutions, try limit, query

qtdecl ::= bound bound query % expected solutions, number of stages, query

sdecl ::= %define binding sdecl   % term binding
        | %solve id : term        % solve with proof term
        | %solve _ : term         % solve

binding ::= id = id          % c = X, X a free variable
          | id = id : term   % c = X : A

decl ::= ...
       | %tabled id.           % a, table family a
       | %query qdecl.         % execute query
       | %querytabled qtdecl.  % execute query with tabled logic programming
       | sdecl.                % solve
In all of these cases, the free variables in a query are interpreted existentially, which is in contrast to constant declarations where free variables are interpreted universally. In particular, free variables might be instantiated during type reconstruction and during execution of the query.
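As a minimal sketch of this difference (the constants nat, z, s, and double below are hypothetical and appear in no example file of this guide; the %query form used here is explained below):

nat : type.
z : nat.
s : nat -> nat.

double : nat -> nat -> type.
% In these declarations the free variables X and Y are interpreted
% universally, as if explicitly quantified by {X:nat} {Y:nat}.
d_z : double z z.
d_s : double (s X) (s (s Y))
       <- double X Y.

% In the query the free variable N is interpreted existentially;
% execution instantiates it to s (s z).
%query 1 * double (s z) N.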
The query form
%query expected try A.
will try to solve the query A and verify that it gives the expected number of solutions, but it will never try to find more than indicated by try. It succeeds and prints a message, whose precise form depends on the value of Twelf.chatter, if A has the expected number of solutions; otherwise it either fails with an error message or does not terminate. `%query' has no other effect on the state of Twelf. Here are some examples.
%query 1 * A.     % check that A has exactly one solution
%query 1 1 A.     % check that A has at least one solution
%query * 3 A.     % A has infinitely many solutions, check 3
%query * * A.     % fails if A has finitely many solutions
%query 1 0 A.     % skip this query
The query form
%solve c : A.
will search for the first solution M of A and then define

c : A = M.

Optionally, it is possible to perform additional bindings for the free variables in A. Let X be a variable of type family B appearing in A, and let us assume that in the solution M this variable has been substituted with the term N; then the query form

%define d = X : B %solve c : A.

will also define

d : B = N.

If there are any free variables remaining in M, N, or A after search, they will be implicitly quantified in the new definitions.
This form of definition is particularly useful to compute and name inputs to future queries. An example of this feature from the file `examples/nd/lam.elf' can be found in section 10.5 Proof Realizations.
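For instance, using the list and append signature introduced in the next section (the names cat and l12 are hypothetical), one could compute and name the concatenation of two singleton lists as follows.

%define l12 = L
%solve cat : append (cons true nil) (cons false nil) L.

This defines cat to be the proof term found by search and l12 : list to be cons true (cons false nil), both of which can then be mentioned in later declarations and queries.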
An interactive top-level can be invoked using the SML expression Twelf.top ();. The top-level prompts with `?- ' and awaits the input of a query, terminated by a period `.' and a RET.
After the query has been parsed, Twelf reconstructs implicit type information, issuing a warning if constraints remain. The result is executed as a query. At any point during the processing of a query the user may interrupt with C-c (that is, CTRL and c) to drop back into ML's interactive top-level.
When Twelf has found a solution, it prints the answer substitution for all free variables in the query, including the proof term variable if one was given. It also notes if there are remaining equational constraints, but currently does not print them.
The top-level then waits for input; typing y after a solution requests another one, as illustrated in the sample interactions below.
As an example we consider lists of propositions and some simple operations on them, as they might be used when programming a theorem prover.
list : type.
nil : list.
cons : o -> list -> list.
First, we want to write a program to append two lists to obtain their concatenation. This is expressed as a relation between the three lists, which in turn is implemented as a type family
append : list -> list -> list -> type.

appNil  : append nil K K.
appCons : append (cons X L) K (cons X M)
           <- append L K M.
Here, we use the synonym A <- B for B -> A to improve readability. We say A if B.
The first sample query concatenates the singleton lists containing true and false. We proceed as if we had loaded the appropriate files and started a top-level with Twelf.top ();.
?- append (cons true nil) (cons false nil) L.
Here, L is a free existential variable. We search for an object M : append (cons true nil) (cons false nil) L, even though this object will not be shown for this form of the query. Each constant declaration in the signature is tried in turn, unifying the head with the goal above. In this manner, we obtain the following sequence of goals and actions. Note that the intermediate forms and comments are not printed when this is run; they are added here to illustrate the behavior.
% original goal after parsing and type reconstruction
?- append (cons true nil) (cons false nil) L.

[try appNil:
   append nil K1 K1 = append (cons true nil) (cons false nil) L
 unification fails with constant clash: nil <> cons
]

[try appCons:
   append (cons X1 L1) K2 (cons X1 M1)
     = append (cons true nil) (cons false nil) L
 unification succeeds with
   X1 = true, L1 = nil, K2 = cons false nil, L = cons true M1
]

% subgoal
?- append nil (cons false nil) M1.

[try appNil:
   append nil K3 K3 = append nil (cons false nil) M1
 unification and subgoal succeeds with
   K3 = cons false nil, M1 = cons false nil
]
At this point the overall goal succeeds and we read off the answer substitution for the only free variable in the query, namely L. It was first determined to be cons true M1 and then M1 was instantiated to cons false nil, leading to the instantiation

L = cons true (cons false nil).
If instead we pose the query
?- X : append (cons true nil) (cons false nil) L.
we also obtain the proof term
L = cons true (cons false nil);
X = appCons appNil.
As another example we consider a query with several solutions which are enumerated when we ask for further results. This time we do not trace the steps of the execution, but show the interaction verbatim.
?- append L K (cons true (cons false nil)).
Solving...
K = cons true (cons false nil);
L = nil.
More? y
K = cons false nil;
L = cons true nil.
More? y
K = nil;
L = cons true (cons false nil).
More? y
No more solutions
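Since the transcript shows exactly three ways to split the list, the same enumeration can also be checked non-interactively with a %query declaration; this line is a small addition and not part of the original example.

%query 3 * append L K (cons true (cons false nil)).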
The operational semantics of Twelf is a form of typed constraint logic
programming. We will use standard terminology from this area. A type
family which is used in a program or goal is called a predicate.
A constant declaration in a signature which is available during search
is called a clause. A clause typically has the form c : a M1 ... Mm <- A1 <- ... <- An, where a M1 ... Mm is the head of the clause and A1 through An are the subgoals. A clause is used to reduce a goal to subgoals by a process called backchaining. Backchaining unifies the head of the clause with the current goal to generate subgoals. Next, we select one of the subgoals as a current goal and continue the search process. Actually, instead of unification (which is undecidable in LF), Twelf employs constraint simplification and carries along equational constraints in a normal form.
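To relate this terminology to the append program above, appCons read as a clause has a head and a single subgoal; the comments below are annotations added here and are not part of the original signature.

appCons : append (cons X L) K (cons X M)   % head
           <- append L K M.                % subgoal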
A hypothesis which is introduced during search is a local assumption; a parameter is a local parameter. Parameters act like constants in unification, except that their occurrences might be restricted due to parameter dependency.
Without going into a formal description, here are the central ideas of the operational semantics.
If a clause c : A <- B <- C. is applied to solve the goal ?- A., then the first subgoal is B and the second subgoal is C. Truly dependent variables will only be subject to unification and never give rise to a subgoal. For example, c : {X:b} a X <- a c is a clause with head a X, subgoal a c, and existential variable X.
A hypothetical goal ?- A -> B. introduces a local assumption A and then solves B under this assumption. To solve atomic goals, local assumptions are tried before global clauses, using the most recently made assumption first. Note that this is different from Prolog assert in that A is available only for solving B. (A small sketch contrasting hypothetical and parametric goals follows below.)
A parametric goal ?- {x:A} B. generates a new parameter a and then solves the result of substituting a for x in B. Parameters are also called universal variables since they are not subject to instantiation during unification. Local parameters will never be used as local assumptions during search.
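To make the contrast between hypothetical and parametric goals concrete, here is a minimal sketch; the constants p, q, and q_from_p are hypothetical and appear in no example file of this guide.

p : type.
q : type.
q_from_p : q <- p.

% Hypothetical goal: p is assumed locally, and the subgoal p of
% q_from_p is solved by that local assumption, so there is exactly
% one solution.
%query 1 * (p -> q).

% Parametric goal: a fresh parameter is introduced for x, but
% parameters are never used as local assumptions, so q cannot be
% derived and the query has no solution.
%query 0 * ({x:p} q).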
When a signature is read, some minimal amount of syntactic translation
may be applied to it in order to speed up execution. We refer to this
process as compilation, in an abuse of the ordinary use of the term.
Compilation will try to replace expensive appeals to unification with more efficient assignment where this optimization is sound. This can be disabled by setting the parameter Twelf.Compile.optimize to false.
As an example, we consider simple type inference for the pure lambda-calculus. An extension of this example to Mini-ML is given in the course notes Pfenning 1992, Computation and Deduction. The code below can be found in the file `examples/guide/lam.elf'.
% Simple types
tp : type.				%name tp T.

arrow : tp -> tp -> tp.			% T1 => T2

% Expressions
exp : type.				%name exp E.

lam : (exp -> exp) -> exp.		% lam x. E
app : exp -> exp -> exp.		% (E1 E2)

% Type inference
% |- E : T  (expression E has type T)

of : exp -> tp -> type.			%name of P.

tp_lam : of (lam E) (arrow T1 T2)	% |- lam x. E : T1 => T2
	  <- ({x:exp}			% if x:T1 |- E : T2.
	       of x T1 -> of (E x) T2).
tp_app : of (app E1 E2) T1		% |- E1 E2 : T1
	  <- of E1 (arrow T2 T1)	% if  |- E1 : T2 => T1
	  <- of E2 T2.			% and |- E2 : T2.
Again, we have used the notation A <- B to emphasize the interpretation of constant declarations as clauses. We now trace the query which infers the most general type of the identity function, represented as lam [x:exp] x. We indicate the scope of hypotheses which are introduced during search by indentation.
% original query, T free
?- of (lam [x:exp] x) T.

% use tp_lam with E = ([x:exp] x) and T = arrow T1 T2
% subgoal
?- {x:exp} of x T1 -> of x T2.

  % introduce parameter e
  ?- of e T1 -> of e T2.

    % introduce hypothesis labeled p
    p:of e T1
    ?- of e T2.

    % succeed by hypothesis p with T1 = T2
At this point the query succeeds and prints the answer substitution.
T = arrow T1 T1.
More? y
No more solutions
We requested more solutions by typing y, but there are no further possibilities. The free variable T1 in the answer substitution means that every instance of arrow T1 T1 provides a solution to the original query. In other words, lam [x:exp] x has type arrow T1 T1 for all types T1.
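The same fact can be checked non-interactively; the following declaration is a small addition, not part of `examples/guide/lam.elf', and verifies that the query has exactly one solution.

%query 1 * of (lam [x:exp] x) T.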
As a second example we verify that self-application is not well-typed in the simply-typed lambda-calculus.
?- of (lam [x:exp] app x x) T.

% use tp_lam with T = arrow T1 T2
% subgoal
?- {x:exp} of x T1 -> of (app x x) T2.

  % introduce parameter e
  ?- of e T1 -> of (app e e) T2.

    % introduce hypothesis p:of e T1
    p:of e T1
    ?- of (app e e) T2.

    % use tp_app
    % first subgoal
    ?- of e (arrow T3 T2).

    % succeed by hypothesis p with T1 = arrow T3 T2

    % second subgoal
    ?- of e T3.

    % fail, since T3 = arrow T3 T2 has no solution
At the point where the second subgoal fails, we backtrack. However, all other alternatives fail immediately, since their clause heads do not unify with the goal, and the overall query fails.
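Correspondingly, the failure can be stated as a query expecting no solutions; again, this declaration is an addition and not part of the example file.

%query 0 * of (lam [x:exp] app x x) T.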
Definitions interact with the logic programming interpretation of signatures in two ways.
At present, family-level definitions are transparent for type-checking, but opaque for proof search. This means that if a : type = b, the constants defining a and b are not mixed during search.
Disclaimer: This use of type-level definitions is still under consideration and is discouraged because it might change in future releases. Moreover, type-level definitions at present do not interact correctly with coverage checking and can lead to unsoundness.
The second interaction is that defined object-level constants are generally not used for the logic programming interpretation. However, they can be forced to be used with a declaration
%clause d : A = M.
For search, it will be treated exactly as if it were d : A., except that the resulting proof term will contain a defined constant instead of a declared constant.
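As a sketch (the constant appCons2 is hypothetical and presumes the append signature from the earlier example), the following defines a derived two-step clause and makes it available for search:

%clause appCons2 : append (cons X (cons Y L)) K (cons X (cons Y M))
                    <- append L K M
                  = [p] appCons (appCons p).

Search may then use appCons2 like any other clause, and proof terms found this way mention the defined constant appCons2.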
In general, a solvable goal may admit several (possibly infinitely many) solutions. Through backtracking, Twelf will generate all of them. There are circumstances, however, where only one of them is correct or interesting. Consider, for example, list membership:
element : type.			%name element X.
a : element.
b : element.
c : element.

list : type.			%name list L.
nil : list.
cons : element -> list -> list.

member : element -> list -> type.
member1 : member X (cons X L).
member2 : member X (cons Y L)
	   <- member X L.
the query
?- member a (cons a (cons b (cons a nil)))
will succeed twice. Since all we are interested in is whether a is contained in the list or not, it may suffice for it to succeed once.
Discarding unwanted solutions in Twelf is accomplished through the %deterministic directive. Declaring a type family deterministic will cause all queries involving it to succeed at most once: if there are several solutions, only the first one will be found. So, for example, declaring

%deterministic member.

will have the effect of preventing backtracking in any query involving the type family member, once one solution has been found:
?- member a (cons a (cons b (cons a nil))).
Solving...
Empty Substitution.
More? y
No more solutions
In other logic programming languages like Prolog, pruning of unwanted solutions is usually accomplished by using the extra-logical operator cut. In general, %deterministic is less powerful than cut: in a language with cut, a deterministic predicate can be modeled by adding a cut at the end of all of its clauses. However, in most practical uses the two are equivalent, and the semantics of %deterministic families is cleaner and better understood.
Logic programming uses a simple depth-first search strategy to search for a proof of a given query. This strategy is incomplete, that is, there are queries that are true, but the logic programming engine will not find a proof due to non-termination. In addition, performance may be hampered by redundant computation.
Tabled logic programming uses memoization to alleviate these problems by
avoiding infinite and redundant paths of computation. The central data
structure is a table in which we store encountered subgoals and
corresponding solutions. When we solve a subgoal G
, then we
check whether G
is in the table. If it is not in the table, then
it will be added. If it is in the table and there are solutions
available, then we can re-use them. If it is in the table and no
solutions are available, then we suspend the computation. This basic
idea is combined with global stages. In each stage, a depth-first
search strategy is used to derive answers from the program. If no more
answers can be derived, computation terminates. It is important to note
that for each answer only one proof is generated, although multiple
different proofs may exist.
Tabled and non-tabled execution can be freely mixed. This is achieved by requiring an explicit declaration
%tabled a.
for every type family a
that is to be tabled. In addition,
at present tabling must be explicitly enabled by invoking
%querytabled expected stages A.
where expected is the expected number of answers and stages bounds the number of stages to use before terminating while searching for a proof of A. If expected is *, it will find all distinct solutions. If there are finitely many solutions, search will terminate after enumerating all of them. If stages is *, then arbitrarily many stages may be explored, which can lead to non-termination over infinite domains.
To illustrate, here is a small simple example that computes reachability in a cyclic graph. First we declare the nodes in the graph and the edge relation.
node : type.
a : node.
b : node.
c : node.
d : node.

edge: node -> node -> type.
e_ab : edge a b.
e_ac : edge a c.
e_ba : edge b a.
e_bd : edge b d.
Next we declare the reachability relation and declare it tabled in order to avoid an infinite loop.
reach: node -> node -> type.
%tabled reach.

r_refl: reach X X.
r_cl  : reach X Y
	 <- edge X Z
	 <- reach Z Y.

%querytabled 4 * (reach a X).
Note that, at present, tables will be completely ignored for ordinary queries, so that
%query * * reach a X.
will not terminate, despite the fact that reach has been declared as tabled.
In the example above we can also give an explicit bound on the number of stages during tabled evaluation with
%querytabled 4 3 (reach a X).
Additional examples using tabling can be found in the directory `examples/tabled/' of the distribution.
One key question in tabled search is how to detect that a subgoal is similar to another subgoal which is already in the table. There are two critical parameters which influence table lookup during search.
Twelf.Table.strengthen, which defaults to false. If true, it eliminates dependencies based on subordination during tabling. This is more expensive, but can lead to increased re-use of prior subgoals in some cases.

Twelf.Table.strategy, which defaults to Twelf.Table.Variant. At its default setting, Twelf.Table.Variant, subgoals are compared for equality modulo renaming of existential and bound variables. When set to Twelf.Table.Subsumption, comparison of subgoals with table entries allows instantiation of the table entries. This is slower, but may lead to increased sharing and better termination properties.
It is also possible (in analogy with Twelf.top ();, explained in section 5.3 Interactive Queries) to start an interactive top level that executes the tabled logic programming interpreter with the command

Twelf.Table.top ();
Furthermore, it should be noted that tracing does not work for the tabled logic programming engine, and that other analyses such as termination checking do not take tabling into account. This is conservative, but it means that many terminating tabled programs cannot at present be verified.