Newsgroups: comp.lang.prolog
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!uhog.mit.edu!bloom-beacon.mit.edu!gatech!howland.reston.ans.net!Germany.EU.net!EU.net!sun4nl!sci.kun.nl!cs.kun.nl!markjan
From: markjan@cs.kun.nl (Mark-Jan Nederhof)
Subject: Re: DCG parsing
Message-ID: <Cz9LK7.EzE@sci.kun.nl>
Sender: news@sci.kun.nl (News owner)
Nntp-Posting-Host: zeus.cs.kun.nl
Organization: University of Nijmegen, The Netherlands
X-Newsreader: NN version 6.4.19
References: <1994Nov8.185049.29988@seas.smu.edu>
Date: Mon, 14 Nov 1994 15:55:19 GMT
Lines: 62

In <1994Nov8.185049.29988@seas.smu.edu> pedersen@seas.smu.edu (Ted Pedersen) 
writes:

>Normally DCG grammars are parsed left to right.  What would be involved
>in making a parser that would take a DCG grammar and parse sentences
>right to left?? [...]

Now that this thread has moved away somewhat from DCGs and now that 
bidirectional parsing has been mentioned, I would like to explain a
particular approach towards bidirectionality which may provide some new
insights. I will simplify the treatment by only discussing context-free
parsing.

One of the most elegant ways to obtain tabular parsing (or `chart'
parsing, as it is often called in NLP) was described by Bernard Lang in
1974. The idea is to take a nondeterministic pushdown automaton (PDA) as
the starting point and to simulate it using a table (cf. memo tables),
so that cubic time complexity is achieved even if there are
exponentially many parses for a given input.
One application of Lang's ideas is that (with some extra
notions that I will not go into) Earley's algorithm can be derived from
a PDA doing nondeterministic top-down parsing. (Tomita's algorithm is
also closely related to what you would get if you applied
Lang's construction to a PDA doing LR parsing.)
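
To make the tabular idea concrete, here is a minimal sketch of an
Earley-style recognizer in Python. It is written directly as a chart
algorithm, not derived from a PDA through Lang's construction, and it
assumes a grammar without empty rules. The toy grammar and all names
(GRAMMAR, Item, earley_recognize) are hypothetical. Note how chart[i]
must be finished before chart[i + 1] is processed: this is the
left-to-right dependency.

```python
from collections import namedtuple

# An item is a rule with a dot position, plus the input position
# (origin) at which recognition of that rule started.
Item = namedtuple("Item", "lhs rhs dot origin")

# Hypothetical, highly ambiguous toy grammar: S -> S S | a
GRAMMAR = {"S": [("S", "S"), ("a",)]}


def earley_recognize(grammar, start, tokens):
    """Return True if tokens can be derived from start.

    Assumes the grammar has no empty right-hand sides.
    """
    n = len(tokens)
    chart = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add(Item(start, rhs, 0, 0))

    for i in range(n + 1):            # strictly left to right
        agenda = list(chart[i])
        while agenda:
            item = agenda.pop()
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:
                    # Predictor: top-down prediction of sym at position i.
                    for rhs in grammar[sym]:
                        new = Item(sym, rhs, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
                elif i < n and tokens[i] == sym:
                    # Scanner: move the dot over a matching terminal.
                    chart[i + 1].add(
                        Item(item.lhs, item.rhs, item.dot + 1, item.origin))
            else:
                # Completer: advance items that were waiting for item.lhs.
                for parent in list(chart[item.origin]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        new = Item(parent.lhs, parent.rhs,
                                   parent.dot + 1, parent.origin)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)

    return any(it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
               for it in chart[n])


print(earley_recognize(GRAMMAR, "S", list("aaa")))  # True
```

Each item is stored once per chart entry, so the table stays polynomial
in size even though the number of parse trees for this grammar grows
very quickly with the input length.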

A PDA obviously has a left-to-right dependency: the input to
the left of some input position has to be processed before the input to
the right of that position can be processed. Applying Lang's
construction in a straightforward way to such a PDA leads to a tabular
algorithm which has that same left-to-right dependency.

The interesting thing now is that this left-to-right dependency can be
avoided in the tabular algorithm by slightly changing Lang's
construction (following a 1968 paper by Aho, Hopcroft and Ullman).
The resulting tabular parsing algorithm:
1) remains correct, in the sense that only correct parses are computed;
2) is in general less efficient, because the left-to-right dependency
  filtered away useless subparses, and this filtering is no longer active;
3) can start parsing at any input position, or start
  parsing *simultaneously* from *all* input positions
  (in effect, do bidirectional parsing).
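
The three properties above can be illustrated with a small sketch in
Python. This is not the modified Lang construction itself; it swaps in
a simpler technique, a bottom-up `island' chart recognizer in which a
rule may be entered at any right-hand-side symbol and extended to the
left or to the right. It assumes a grammar without empty rules, and the
grammar and all names (GRAMMAR, island_recognize, seed_order) are
hypothetical.

```python
# Hypothetical toy grammar: S -> S S | a
GRAMMAR = {"S": [("S", "S"), ("a",)]}


def island_recognize(grammar, start, tokens, seed_order=None):
    """Bottom-up recognizer without left-to-right dependency.

    seed_order lists the input positions in the order in which they are
    fed to the agenda. Assumes no empty right-hand sides.
    Returns (recognized, spans).
    """
    n = len(tokens)
    rules = [(lhs, rhs) for lhs, alts in grammar.items() for rhs in alts]
    spans = set()    # (symbol, i, j): symbol derives tokens[i:j]
    partial = set()  # (lhs, rhs, lo, hi, i, j): rhs[lo:hi] derives tokens[i:j]
    order = list(range(n)) if seed_order is None else list(seed_order)
    agenda = [("span", (tokens[p], p, p + 1)) for p in order]

    while agenda:
        kind, x = agenda.pop()
        if kind == "span":
            if x in spans:
                continue
            spans.add(x)
            sym, i, j = x
            # Enter any rule at any occurrence of sym in its right-hand side.
            for lhs, rhs in rules:
                for m, s in enumerate(rhs):
                    if s == sym:
                        agenda.append(("partial", (lhs, rhs, m, m + 1, i, j)))
            # Let existing partial items grow over the new span.
            for (lhs, rhs, lo, hi, a, b) in list(partial):
                if hi < len(rhs) and rhs[hi] == sym and b == i:
                    agenda.append(("partial", (lhs, rhs, lo, hi + 1, a, j)))
                if lo > 0 and rhs[lo - 1] == sym and a == j:
                    agenda.append(("partial", (lhs, rhs, lo - 1, hi, i, b)))
        else:
            if x in partial:
                continue
            partial.add(x)
            lhs, rhs, lo, hi, i, j = x
            if lo == 0 and hi == len(rhs):
                # Whole right-hand side recognized: lhs derives tokens[i:j].
                agenda.append(("span", (lhs, i, j)))
            for (sym, a, b) in list(spans):
                if hi < len(rhs) and rhs[hi] == sym and a == j:
                    # Extend to the right.
                    agenda.append(("partial", (lhs, rhs, lo, hi + 1, i, b)))
                if lo > 0 and rhs[lo - 1] == sym and b == i:
                    # Extend to the left.
                    agenda.append(("partial", (lhs, rhs, lo - 1, hi, a, j)))

    return (start, 0, n) in spans, spans


tokens = list("aaa")
left_to_right = island_recognize(GRAMMAR, "S", tokens, [0, 1, 2])
right_to_left = island_recognize(GRAMMAR, "S", tokens, [2, 1, 0])
middle_out = island_recognize(GRAMMAR, "S", tokens, [1, 0, 2])
print(left_to_right == right_to_left == middle_out)  # True
```

The final table is the same whatever the seeding order, which is
property 3. The price, as in property 2, is that the table also holds
spans such as ("S", 1, 2) that a top-down filtered algorithm may never
need to compute.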

It is interesting that any PDA can be taken as the starting point.
So one could take, e.g., a normal left-to-right left-corner PDA recognizer
with top-down filtering, but the tabular algorithm derived from
this PDA could process the input from, say, right to left, or from the
middle of the input outward.

My Ph.D. thesis (Chapter 1) explains all this more carefully.
(Ask me to send you a copy, provided you're really interested.)

>[...] I suppose there are implications for
>parallel parsing as well [...]

Indeed. In Chapter 3 of my thesis I mention a parallel algorithm
which results from taking a PDA and then transforming it into a tabular
algorithm while eliminating the left-to-right dependency (in the form of
top-down filtering).

Hope this helps,
  Mark-Jan Nederhof
  University of Nijmegen
