Newsgroups: comp.lang.prolog
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.sprintlink.net!in2.uu.net!munnari.oz.au!cs.mu.OZ.AU!munta.cs.mu.OZ.AU!fjh
From: fjh@munta.cs.mu.OZ.AU (Fergus Henderson)
Subject: Re: Strings in DCG-style Chart Parsing
Message-ID: <9529201.19103@mulga.cs.mu.OZ.AU>
Sender: news@cs.mu.OZ.AU (CS-Usenet)
Organization: Computer Science, University of Melbourne, Australia
References: <PEREIRA.95Sep30104755@alta.research.att.com> <1995Oct3.095758.26601@let.rug.nl> <PEREIRA.95Oct3195504@alta.research.att.com> <4563vn$aj7@idefix.CS.kuleuven.ac.be>
Date: Wed, 18 Oct 1995 15:42:24 GMT
Lines: 122

tarau@CS.kuleuven.ac.be (Paul Tarau) writes:

>TRANSLATION BASED IMPLEMENTATION OF DCGs is HARMFUL.

Very interesting article.

>Although widely used and historically an important factor of the
>success of LP languages, the translation-based implementation of DCGs
>has become a `software engineering' bottleneck for the practical use of
>logic grammars. Some of the reasons follow:
>
>- although DCGs as proposed by Pereira and Warren were intended
>  to be representation independent, most translations actually implement
>  metamorphosis grammars and assume a hard-coded list-representation of the
>  input-stream (worse, they are all doing it slightly differently,
>  just enough to break your grammar :-))

Only the list syntax for terminal symbols assumes a hard-coded list
representation.  If you simply avoid using that particular part of the
DCG syntax, then you can use any representation you like.  For example,
it is easy to define a predicate token/3 which gets the next token
using your chosen representation; if you use

	foo --> token(X), token(Y).

instead of

	foo --> [X, Y].

then there will be no problems with using an alternative representation.
For example, Mercury programs which do I/O typically use a DCG where
the implicit DCG arguments are an abstract type `io__state' which
represents the state of the world (this abstract type is not
implemented as a list ;-).
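
To make the token/3 suggestion concrete, here is a minimal sketch.  The
state representation s(Index, Tokens) and the name toks/N are assumptions
made up for illustration; token/3 is not a library predicate.  The sketch
also assumes the usual translation, which appends two state arguments to
each nonterminal.

```prolog
% Hypothetical token/3 for a non-list input representation.
% The state is s(Index, Tokens), where Tokens is a compound term
% whose arguments are the tokens, e.g. toks(a, b).
token(T, s(I0, Toks), s(I, Toks)) :-
    I is I0 + 1,
    arg(I, Toks, T).        % fails once I exceeds the arity of Toks

% Grammar rule written without the [X, Y] terminal syntax.
foo(X, Y) --> token(X), token(Y).
```

With this loaded, the goal `foo(X, Y, s(0, toks(a, b)), S)` should give
X = a, Y = b, S = s(2, toks(a, b)).  The translated predicate is called
directly here because some implementations of phrase/3 insist that the
input be a list.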

>- DCG meta-programming with phrase/3 is expensive as it implies
>  expanding on the fly, with lot of structure-crunching
>  so that programmers often write ugly and data-dependent grammars
>  for efficiency reasons (as a symptom, look for repeated occurrences
>  of the same morphological item at various syntactic and semantic levels)

I don't quite understand this point.  Could you give an example of
"DCG meta-programming with phrase/3", and explain why it is expensive?
(Is this simply a limitation of current Prolog implementation
technology that could be solved with a sufficiently smart compiler?)

>- there's no simple and efficient way to translate  multiple-stream 
>  DCGs as Peter VanRoy's extended DCGs (also used in Wild-Life).
>  Translation based implementations of EDCGs need extra
>  declarations and result in a fairly complex preprocessor

I've been kicking around an idea about how to do this for a while now.
It's simple, needs no extra declarations, and results in a fairly
simple preprocessor.  As for efficiency, well, if you pass down N DCGs,
you do end up with 2N extra arguments, but that doesn't seem to be too
high a price, and a smart compiler could do a very good job of
optimizing that sort of code.  Someday, when I get time, I will write
it down, and maybe even post it to comp.lang.prolog...
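
This is not the preprocessor scheme itself, only a hand-written sketch of
what "2N extra arguments" looks like for N = 2: one pair threads a token
list, the other a counter.  The predicate names word/5 and words/5 are
made up for illustration.

```prolog
% Two threaded states, written out by hand:
%   S0/S  is the token list before/after,
%   N0/N  is a running count of tokens consumed.
word(W, [W|S], S, N0, N) :-
    N is N0 + 1.

% words(Ws, S0, S, N0, N): consume the tokens Ws, counting them.
words([], S, S, N, N).
words([W|Ws], S0, S, N0, N) :-
    word(W, S0, S1, N0, N1),
    words(Ws, S1, S, N1, N).
```

For example, the goal `words(Ws, [a, b], [], 0, N)` should give
Ws = [a, b] and N = 2.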

>- source-level debugging is awkward in the presence of DCG 
>  translation (as it is the case also with macro-intensive C or Lisp).

I haven't found that to be a problem.

DCG translation is a single, simple, fixed translation,
whereas in macro-intensive C you have multiple levels
of programmer-defined complex #ifdef'd macros to trace through.
There is a world of difference.
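
To show how little there is to trace through, here is a sketch of a rule
next to its expansion, assuming the usual translation that adds a pair of
state arguments.  The grammar is a made-up toy, token/3 is a hypothetical
helper, and sent_expanded/2 is written by hand; the clause a particular
system generates may differ in variable naming and in how it handles
terminals.

```prolog
% Hypothetical helper: take one token from a list-based input.
token(T, [T|S], S).

% A small grammar using only nonterminal calls.
sent --> ng, v.
ng   --> art, n.
art  --> token(the).
n    --> token(cat).
v    --> token(sleeps).

% Hand-written equivalent of what `sent --> ng, v.` expands to:
% each nonterminal gains an input state and an output state.
sent_expanded(S0, S) :-
    ng(S0, S1),
    v(S1, S).
```

Both `sent([the, cat, sleeps], [])` and
`sent_expanded([the, cat, sleeps], [])` should succeed, so a debugger
showing the expanded clause is showing something one step away from the
source rule.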

>At this point, I am quite convinced that the translation is
>avoidable (see BinProlog 4.00's HAGs = Hidden Argument Grammars)
>with comparable speed and better memory usage (proportional
>to the number of choice points, not the number of calls).
>
>HAGs have a WAM-level optimal implementation in BinProlog 4.00
>where they can be used as follows:
>
>sent:-ng,v.        ng:-art,n.
>art:- #the.        art:- #a.
>  n:- #cat.          n:- #dog.
>  v:- #walks.        v:- #sleeps.

Can you please tell me the declarative semantics of `sent'?
How about the declarative semantics of `art'?

Translation-based DCGs have a simple translation-based declarative
semantics.  Do HAGs have a declarative semantics?  What are the
semantics of `,' in a language with HAGs - and is it related to logical
conjunction?  Some of us consider declarative logical semantics to be an
important issue in logic programming.

>Otherwise, it would be a pity to see a given instance of the
>translation based implementation of DCGs standardized in ISO Prolog.
>I think that this would preclude widespread use of EDCGs and HAGs,
>not to mention the new monadic-I/O of Escher and Mercury which
>offer equivalent functionality and would make great additions to Prolog.

A correction - I/O in Mercury is not monadic.  It is based on unique
modes, which are similar to linear types in functional programming,
not on monads, which are a somewhat different way of achieving a
similar effect.  Monadic I/O is based in an essential way on the use
of higher-order functions; although Mercury does support higher-order
predicates (the equivalent to higher-order functions), Mercury's I/O
system does not need to use them.  Mercury's I/O is closer to I/O in the
functional programming language Clean using Clean's UNQ types than it is
to Haskell's monadic I/O.

Use of translation-based DCGs certainly doesn't preclude the use of
Mercury's I/O system; on the contrary, translation-based DCGs are very
useful for writing I/O code in Mercury.  In Mercury, DCGs are
translation-based, and the language reference manual specifies this.
Indeed, quite a bit of the DCG code in the Mercury compiler - and *all*
of the type and mode declarations for DCG predicates - assume a
translation-based implementation of DCGs.

-- 
Fergus Henderson             	WWW: http://www.cs.mu.oz.au/~fjh
fjh@cs.mu.oz.au              	PGP: finger fjh@128.250.37.3
