               A Useful Extension to Prolog's
              Definite Clause Grammar Notation


                       Peter Van Roy
               vanroy@bellatrix.berkeley.edu

                 Computer Science Division
                  University of California
                     Berkeley, CA 94720





_1.  _I_n_t_r_o_d_u_c_t_i_o_n

     Programming in a purely applicative style implies  that
all  information  is passed in arguments.  However, in prac-
tice the number of  arguments  becomes  large,  which  makes
writing  and  maintaining such programs difficult.  Two ways
of getting around this problem are (1) to encapsulate infor-
mation  in  compound  structures  which are passed in single
arguments, and (2) to use global instead of  local  informa-
tion.  Both of these techniques are commonly used in impera-
tive languages such as C, but neither is a satisfying way to
program in Prolog:

o+    Because Prolog is a single-assignment language, modify-
     ing  encapsulated information requires a time-consuming
     copy of the entire structure.  Sophisticated  optimiza-
     tions  could  make this efficient, but compilers imple-
     menting them do not yet exist.

o+    Modifying global information destroys the advantages of
     programming  in  an applicative style, such as the ease
     of mathematical analysis and the suitability for paral-
     lel execution.

A third approach with none of  the  above  disadvantages  is
extending  Prolog  to allow an arbitrary number of arguments
without  increasing  the  size  of  the  source  code.   The
extended  Prolog  is  translated  into  standard Prolog by a
preprocessor.   This  article  describes  an  extension   to
Prolog's  Definite  Clause  Grammar notation that implements
this idea.

_2.  _D_e_f_i_n_i_t_e _C_l_a_u_s_e _G_r_a_m_m_a_r (_D_C_G) _n_o_t_a_t_i_o_n

     DCG notation was developed as the result of research in
natural language parsing and understanding [Pereira & Warren
1980].  It allows the specification of a class of attributed
unification  grammars with semantic actions.  These grammars
are strictly more powerful than context-free grammars.  Pro-
logs  that  conform  to  the  Edinburgh standard [Clocksin &
Mellish  1981]  provide   a   built-in   preprocessor   that
translates  clauses  written  in  DCG notation into standard
Prolog.

     An important Prolog programming technique is the  accu-
mulator  [Sterling & Shapiro 1986].  The DCG notation imple-
ments a single implicit accumulator.  For example,  the  DCG
clause:

        term(S) --> factor(A), [+], factor(B), {S is A+B}.

is translated internally into the Prolog clause:

        term(S,X1,X4) :- factor(A,X1,X2), X2=[+|X3], factor(B,X3,X4), S is A+B.

Each predicate is given two additional arguments.   Chaining
together these arguments implements the accumulator.
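
     As a minimal sketch of how such a clause runs, assume a
hypothetical factor that accepts a single integer token (the
article does not define factor).  The phrase/2 predicate found
in most Edinburgh-style Prologs supplies the two hidden
arguments:

        factor(N) --> [N], {integer(N)}.       /* Assumed: one integer token */

        term(S) --> factor(A), [+], factor(B), {S is A+B}.

        demo(S) :- phrase(term(S), [1,+,2]).   /* Hidden arguments: the token list and [] */

The query demo(S) binds S to 3.  The name demo is likewise an
illustration only.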

_3.  _E_x_t_e_n_d_i_n_g _t_h_e _D_C_G _n_o_t_a_t_i_o_n

     The DCG notation is a concise and clear way to  express
the  use  of a single accumulator.  However, in the develop-
ment of large Prolog programs I  have  found  it  useful  to
carry  more  than  one  accumulator.  If written explicitly,
each accumulator  requires  two  additional  arguments,  and
these arguments must be chained together.  This requires the
invention of many arbitrary variable names, and  the  chance
of introducing errors is large.  Modifying or extending this
code, for example to add another accumulator, is tedious.
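
     The bookkeeping can be seen in a small sketch.  The
predicate sum_len below is hypothetical and not part of the
program discussed in this article; it threads a running sum
and a running length through a list by hand:

        /* sum_len(List, Sum0, Sum, Len0, Len) */
        sum_len([], S, S, N, N).
        sum_len([X|Xs], S0, S, N0, N) :-
                S1 is S0+X,        /* Step the sum accumulator */
                N1 is N0+1,        /* Step the length accumulator */
                sum_len(Xs, S1, S, N1, N).

The query sum_len([3,4,5],0,S,0,N) gives S = 12 and N = 3.  A
third accumulator would add two more arguments to each clause
head and to the recursive call.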

     One way to solve this problem  is  to  extend  the  DCG
notation.  The extension described here allows for an unlim-
ited number of  named  accumulators,  and  handles  all  the
tedium  of  parameter  passing.  Each accumulator requires a
single Prolog fact as its declaration.  The bulk of the pro-
gram  source  does not depend on the number of accumulators,
so maintaining and extending it is simplified.   For  single
accumulators the notation defaults to the standard DCG nota-
tion.

     Other extensions to the DCG  notation  have  been  pro-
posed, for example Extraposition Grammars [Pereira 1981] and
Definite Clause Translation Grammars [Abramson  1984].   The
motivation   for   these   extensions   is  natural-language
analysis, and they are not directly useful as aids  in  pro-
gram construction.

_4.  _A_n _e_x_a_m_p_l_e

     To illustrate the extended notation, consider the  fol-
lowing  Prolog  predicate  which  converts infix expressions
containing identifiers,  integers,  and  addition  (+)  into
machine code for a simple stack machine, and also calculates
the size of the code:

        expr_code(A+B, S1, S4, C1, C4) :-
                expr_code(A, S1, S2, C1, C2),
                expr_code(B, S2, S3, C2, C3),
                C3=[plus|C4],      /* Explicitly accumulate 'plus' */
                S4 is S3+1.        /* Explicitly add 1 to the size */
        expr_code(I, S1, S2, C1, C2) :-
                atomic(I),
                C1=[push(I)|C2],
                S2 is S1+1.

This predicate has two accumulators: the  machine  code  and
its size.  A sample call is expr_code(a+3+b,0,Size,Code,[]),
which returns the result:

        Size = 5
        Code = [push(a),push(3),plus,push(b),plus]

With DCG notation it is possible to hide the code  accumula-
tor, although the size is still calculated explicitly:

        expr_code(A+B, S1, S4) -->
                expr_code(A, S1, S2),
                expr_code(B, S2, S3),
                [plus],            /* Accumulate 'plus' in a hidden accumulator */
                {S4 is S3+1}.      /* Explicitly add 1 to the size */
        expr_code(I, S1, S2) -->
                {atomic(I)},
                [push(I)],
                {S2 is S1+1}.

The extended notation hides both accumulators:

        expr_code(A+B) -->>
                expr_code(A),
                expr_code(B),
                [plus]:code,       /* Accumulate 'plus' in the code accumulator */
                [1]:size.          /* Accumulate 1 in the size accumulator */
        expr_code(I) -->>
                {atomic(I)},
                [push(I)]:code,
                [1]:size.

The translation of this version is identical to the original
definition.   The  preprocessor needs the following declara-
tions:

        acc_info(code, T, Out, In, (Out=[T|In])).   /* Accumulator declarations */
        acc_info(size, T, In, Out, (Out is In+T)).

        pred_info(expr_code, 1, [size,code]).       /* Predicate declaration */

For each accumulator this declares the accumulating function,
and for each predicate this declares the name, the arity (the
number of explicit arguments, not counting the hidden
accumulator arguments), and the accumulators it uses.  The
order of the In and Out arguments determines whether
accumulation proceeds in the forward direction (see size) or
in the reverse direction (see code).  Choosing the proper
direction is important if the accumulating function requires
some of its arguments to be instantiated.  For example, is/2
needs its right-hand side to be instantiated, so size must be
accumulated in the forward direction.
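
     One way to read these declarations is sketched below.
It assumes that the third and fourth arguments of acc_info
stand for the accumulator values before and after one
accumulation step, which is consistent with the translation
shown earlier.  The helper joiner is hypothetical and does
not describe the preprocessor's own code:

        acc_info(code, T, Out, In, (Out=[T|In])).
        acc_info(size, T, In, Out, (Out is In+T)).

        /* Hypothetical: build the goal for accumulating Term between Left and Right */
        joiner(Acc, Term, Left, Right, Goal) :-
                acc_info(Acc, Term, Left, Right, Goal).

The query joiner(code,plus,C3,C4,G) gives G = (C3=[plus|C4]),
and joiner(size,1,S3,S4,G) gives G = (S4 is S3+1).  These are
the two goals written out by hand in the first clause of the
original definition.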

_5.  _C_o_n_c_l_u_d_i_n_g _r_e_m_a_r_k_s

     An extension to Prolog's DCG notation  that  implements
an  unlimited  number of named accumulators was developed to
simplify purely applicative Prolog programming.   A  prepro-
cessor  for  C-Prolog  and  Quintus  Prolog  is available by
anonymous ftp to  arpa.berkeley.edu  or  by  contacting  the
author.   Comments and suggestions for improvements are wel-
come.

     This research was partially sponsored  by  the  Defense
Advanced  Research  Projects  Agency  (DoD) and monitored by
Space & Naval Warfare Systems  Command  under  Contract  No.
N00014-88-K-0579.

_6.  _R_e_f_e_r_e_n_c_e_s

[Abramson 1984]

     H. Abramson, Definite Clause Translation Grammars,
     Proc. 1984 International Symposium on Logic Programming,
     1984, pp 233-240.

[Clocksin & Mellish 1981]

     W.F. Clocksin and C.S. Mellish, Programming in Prolog,
     Springer-Verlag, 1981.

[Pereira 1981]

     F. Pereira, Extraposition Grammars, American Journal of
     Computational Linguistics, 1981, vol. 7, no. 4, pp
     243-255.

[Pereira & Warren 1980]

     F. Pereira and D.H.D. Warren, Definite Clause Grammars
     for Language Analysis - A Survey of the Formalism and a
     Comparison with Augmented Transition Networks, Journal
     of Artificial Intelligence, 1980, vol. 13, no. 3, pp
     231-278.

[Sterling & Shapiro 1986]

     L. Sterling and E. Shapiro, The Art of Prolog, MIT
     Press, 1986.