.\"     Start of program display
.de AA
.DS
.ft C
.lg 0
.ps -2
.vs -2
..

.\"     End of program display
.de ZZ
.ps +2
.vs +2
.lg
.ft R
.DE
..

.nr PS 11
.nr VS 13
.ps 11
.vs 13

.rm CH

.TL
A Useful Extension to Prolog's
.br
Definite Clause Grammar Notation
.AU
Peter Van Roy
vanroy@bellatrix.berkeley.edu
.AI
Computer Science Division
University of California
Berkeley, CA 94720
.AE
.NH
Introduction
.PP
Programming in a purely applicative style
implies that all information is passed in arguments.
In practice, however, the number of arguments
grows large,
which makes such programs
difficult to write and maintain.
Two ways of getting around this problem are (1)
to encapsulate information in compound structures
which are passed in single arguments,
and (2) to use global instead of local information.
Both of these techniques are commonly used
in imperative languages such as C, but
neither is a satisfying way to program
in Prolog:
.IP \(bu
Because Prolog is a single-assignment language,
modifying encapsulated information
requires a time-consuming copy of the entire structure.
Sophisticated optimizations could make this efficient,
but compilers implementing them do not yet exist.
.IP \(bu
Modifying global information destroys the advantages
of programming in an applicative style,
such as the ease of mathematical analysis
and the suitability for parallel execution.
.LP
A third approach with none of the above
disadvantages is extending Prolog
to allow an arbitrary number of arguments
without increasing
the size of the source code.
The extended Prolog is translated
into standard Prolog by a preprocessor.
This article describes
an extension to Prolog's Definite Clause Grammar notation
that implements this idea.
.NH
Definite Clause Grammar (DCG) notation
.PP
DCG notation grew out of
research in natural language parsing and understanding [Pereira & Warren 1980].
It allows the specification of
a class of attributed unification grammars
with semantic actions.
These grammars are strictly more powerful than context-free grammars.
Prologs that conform to the Edinburgh standard [Clocksin & Mellish 1981]
provide a built-in preprocessor that translates clauses written in
DCG notation into standard Prolog.
.PP
An important Prolog programming technique
is the accumulator [Sterling & Shapiro 1986].
The DCG notation implements a
single implicit accumulator.
For example, the DCG clause:
.AA
term(S) --> factor(A), [+], factor(B), {S is A+B}.
.ZZ
is translated internally into the Prolog clause:
.AA
term(S,X1,X4) :- factor(A,X1,X2), X2=[+|X3], factor(B,X3,X4), S is A+B.
.ZZ
Each predicate is given two additional arguments.
Chaining together these arguments
implements the accumulator.
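.PP
To try the clause, a definition of \s-1\fCfactor\fR\s+1 is needed.
The sketch below assumes a hypothetical \s-1\fCfactor\fR\s+1 that accepts
a single integer token, and it assumes the Prolog system provides
\s-1\fCphrase/2\fR\s+1, which is commonly available for calling
predicates written in DCG notation:
.AA
/* Assumed for illustration: a factor is one integer token. */
factor(N) --> [N], {integer(N)}.

term(S) --> factor(A), [+], factor(B), {S is A+B}.

/* The token list plays the role of X1; the remainder X4 is []. */
/* Query:   ?- phrase(term(S), [2,+,3]).     gives   S = 5      */
.ZZ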
.NH
Extending the DCG notation
.PP
The DCG notation is a concise and clear way
to express the use of
a single accumulator.
However, in the development of large Prolog programs
I have found it useful to carry more than one
accumulator.
If written explicitly, each accumulator requires
two additional arguments,
and these arguments
must be chained together.
This requires the invention of many arbitrary variable names,
and the chance of introducing errors is large.
Modifying or extending this code, for example to add
another accumulator, is tedious.
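.PP
As a small illustration of this bookkeeping,
the following sketch threads two accumulators by hand,
a running sum and a running length,
through a list traversal.
The predicate \s-1\fCsum_len\fR\s+1 is hypothetical
and is not part of the extension:
.AA
/* sum_len(List, Sum0, Sum, Len0, Len)                   */
/* Two accumulators require four additional arguments.  */
sum_len([], S, S, L, L).
sum_len([X|Xs], S0, S, L0, L) :-
	S1 is S0+X,        /* chain the sum:    S0, S1, S */
	L1 is L0+1,        /* chain the length: L0, L1, L */
	sum_len(Xs, S1, S, L1, L).
.ZZ
The query \s-1\fCsum_len([3,4,5],0,Sum,0,Len)\fR\s+1 gives
\s-1\fCSum = 12\fR\s+1 and \s-1\fCLen = 3\fR\s+1.
Adding a third accumulator would change every clause head
and every recursive call.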
.PP
One way to
solve this problem
is to extend the DCG notation.
The extension described here
allows for an unlimited number of
named accumulators,
and handles all the tedium of
parameter passing.
Each accumulator requires a
single Prolog fact as its
declaration.
The bulk of the program source
does not depend on the number of accumulators,
so maintaining and extending it is simplified.
For single accumulators
the notation defaults to the standard DCG notation.
.PP
Other extensions to the DCG notation have been proposed,
for example Extraposition Grammars [Pereira 1981]
and Definite Clause Translation Grammars [Abramson 1984].
The motivation for these extensions
is natural-language analysis, and
they are not directly useful
as aids in program construction.
.NH
An example
.PP
To illustrate the extended notation,
consider the following Prolog predicate
which converts infix expressions containing identifiers, integers, and
addition (+) into machine code for a simple stack machine, and also
calculates the size of the code:
.AA
expr_code(A+B, S1, S4, C1, C4) :-
	expr_code(A, S1, S2, C1, C2),
	expr_code(B, S2, S3, C2, C3),
	C3=[plus|C4],      /* Explicitly accumulate 'plus' */
	S4 is S3+1.        /* Explicitly add 1 to the size */
expr_code(I, S1, S2, C1, C2) :-
	atomic(I),
	C1=[push(I)|C2],
	S2 is S1+1.
.ZZ
This predicate has two accumulators:
the machine code and its size.
A sample call is \s-1\fCexpr_code(a+3+b,0,Size,Code,[])\s+1\fR, which returns
the result:
.AA
Size = 5
Code = [push(a),push(3),plus,push(b),plus]
.ZZ
With DCG notation it is possible to hide
the code accumulator, although the size
is still calculated explicitly:
.AA
expr_code(A+B, S1, S4) -->
	expr_code(A, S1, S2),
	expr_code(B, S2, S3),
	[plus],            /* Accumulate 'plus' in a hidden accumulator */
	{S4 is S3+1}.      /* Explicitly add 1 to the size */
expr_code(I, S1, S2) -->
	{atomic(I)},
	[push(I)],
	{S2 is S1+1}.
.ZZ
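.PP
Since the DCG translation places the two hidden arguments last,
the translated predicate can be called
just like the explicit version.
The sketch below assumes the Prolog system provides
\s-1\fCphrase/2\fR\s+1;
the wrapper name \s-1\fCcompile_expr\fR\s+1 is hypothetical:
.AA
/* DCG clauses repeated so that this block loads on its own. */
expr_code(A+B, S1, S4) -->
	expr_code(A, S1, S2),
	expr_code(B, S2, S3),
	[plus],
	{S4 is S3+1}.
expr_code(I, S1, S2) -->
	{atomic(I)},
	[push(I)],
	{S2 is S1+1}.

/* Hypothetical wrapper: start the size at 0, collect the code. */
compile_expr(Expr, Size, Code) :-
	phrase(expr_code(Expr, 0, Size), Code).
.ZZ
The query \s-1\fCcompile_expr(a+3+b,Size,Code)\fR\s+1 gives
\s-1\fCSize = 5\fR\s+1 and the same \s-1\fCCode\fR\s+1 list as
the sample call shown earlier.
.LP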
The extended notation
hides both accumulators:
.AA
expr_code(A+B) -->>
	expr_code(A),
	expr_code(B),
	[plus]:code,       /* Accumulate 'plus' in the code accumulator */
	[1]:size.          /* Accumulate 1 in the size accumulator */
expr_code(I) -->>
	{atomic(I)},
	[push(I)]:code,
	[1]:size.
.ZZ
The translation of this version
is identical to the explicit five-argument
definition given at the start of this section.
The preprocessor needs the following declarations:
.AA
acc_info(code, T, Out, In, (Out=[T|In])).   /* Accumulator declarations */
acc_info(size, T, In, Out, (Out is In+T)).

pred_info(expr_code, 1, [size,code]).       /* Predicate declaration */
.ZZ
Each \s-1\fCacc_info\fR\s+1 fact declares an accumulator's name
and its accumulating function,
and each \s-1\fCpred_info\fR\s+1 fact declares a predicate's
name, arity (number of arguments),
and the accumulators it uses.
The order of the \s-1\fCIn\fR\s+1 and \s-1\fCOut\fR\s+1 arguments
determines whether accumulation proceeds in the
forward direction (see \s-1\fCsize\fR\s+1) or in the
reverse direction (see \s-1\fCcode\fR\s+1).
Choosing the proper direction
is important if the accumulating function
requires some of its arguments to be instantiated.
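.PP
One way to read these declarations is as goal templates:
unifying a term and a pair of accumulator variables with an
\s-1\fCacc_info\fR\s+1 fact yields the goal that links the two variables.
The sketch below only illustrates that reading
and is not the preprocessor itself;
the helper name \s-1\fCacc_goal\fR\s+1 is hypothetical:
.AA
acc_info(code, T, Out, In, (Out=[T|In])).
acc_info(size, T, In, Out, (Out is In+T)).

/* acc_goal(Acc, Term, Left, Right, Goal): hypothetical helper.  */
/* Left and Right are the accumulator variables before and       */
/* after the accumulation in the clause body; Goal links them.   */
acc_goal(Acc, Term, Left, Right, Goal) :-
	acc_info(Acc, Term, Left, Right, Goal).
.ZZ
Under this reading,
\s-1\fCacc_goal(size,1,S3,S4,G)\fR\s+1 binds \s-1\fCG\fR\s+1 to
\s-1\fCS4 is S3+1\fR\s+1, and
\s-1\fCacc_goal(code,plus,C3,C4,G)\fR\s+1 binds \s-1\fCG\fR\s+1 to
\s-1\fCC3=[plus|C4]\fR\s+1.
These are the two goals that close the first clause of the explicit definition.
The \s-1\fCsize\fR\s+1 goal can run only once \s-1\fCS3\fR\s+1 is instantiated,
which is why that accumulator proceeds in the forward direction.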
.NH
Concluding remarks
.PP
An extension to Prolog's DCG notation
that implements
an unlimited number of
named accumulators was developed
to simplify purely applicative Prolog programming.
A preprocessor for
C-Prolog and Quintus Prolog
is available by
anonymous ftp to arpa.berkeley.edu or by contacting the author.
Comments and suggestions for improvements are welcome.
.PP
This research was partially sponsored by the Defense Advanced Research
Projects Agency (DoD) and monitored by Space & Naval Warfare Systems
Command under Contract No. N00014-88-K-0579.
.NH
References
.LP
[Abramson 1984]
.IP
H. Abramson,
\*QDefinite Clause Translation Grammars,\*U
.I
Proc. 1984 International Symposium on Logic Programming,
.R
1984,
pp 233-240.
.LP
[Clocksin & Mellish 1981]
.IP
W.F. Clocksin and C.S. Mellish,
\*QProgramming in Prolog,\*U
.I
Springer-Verlag,
.R
1981.
.LP
[Pereira 1981]
.IP
F. Pereira,
\*QExtraposition Grammars,\*U
.I
American Journal of Computational Linguistics,
.R
1981,
vol. 7, no. 4, pp 243-255.
.LP
[Pereira & Warren 1980]
.IP
F. Pereira and D.H.D. Warren,
\*QDefinite Clause Grammars for Language Analysis\(emA Survey of the
Formalism and a Comparison with Augmented Transition Networks,\*U
.I
Artificial Intelligence,
.R
1980,
vol. 13, no. 3, pp 231-278.
.LP
[Sterling & Shapiro 1986]
.IP
L. Sterling and E. Shapiro,
\*QThe Art of Prolog,\*U
.I
MIT Press,
.R
1986.
