Newsgroups: comp.lang.prolog
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!bloom-beacon.mit.edu!uhog.mit.edu!news.mathworks.com!udel!gatech!howland.reston.ans.net!pipex!uunet!allegra!ulysses!alice!pereira
From: pereira@alta.research.att.com (Fernando Pereira)
Subject: Re: DCGs
In-Reply-To: geoff@coral.cs.jcu.edu.au's message of 10 Nov 94 04:03:31 GMT
Message-ID: <PEREIRA.94Nov10222635@alta.research.att.com>
Sender: netnews@ulysses.homer.att.com (Shankar Ishwar)
Reply-To: pereira@research.att.com
Organization: AT&T Bell Laboratories
References: <geoff.784440211@coral.cs.jcu.edu.au>
Date: Fri, 11 Nov 1994 03:26:35 GMT
Lines: 41

In article <geoff.784440211@coral.cs.jcu.edu.au> geoff@coral.cs.jcu.edu.au (Geoff Sutcliffe) writes:
   One of the researchers in Building Management is using DCGs to do some
   parsing of data files (yes, I'll tell him to submit to Prolog 1000!). He
   has a problem that the files are very large, so he cannot get all the
   tokens into a list to be submitted to the DCG. I've been trying to dream
   up a way of making 'demand driven lists' (a la Scheme's streams) for
   DCGs, but have failed.

Here's a technique that I have used in Quintus Prolog. It works,
although it is on the slow side. In the DCG, instead of writing [X] for
a terminal symbol X, write `X, with the following operator declaration
and predicate definition:

:- op(500, fx, `).

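% `(C, State0, State): C is the character code read at the position
% recorded in State0. A state is a term @(Position, Stream). The stream
% is repositioned before every read, so a state reached again after
% backtracking still reads from its own position.
`(C, @(P0,Stream), @(P,Stream)) :-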
   stream_position(Stream, _, P0),
   get0(Stream, C),
   stream_position(Stream, P).
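
As an illustration, here is a minimal sketch of a grammar written in
this style. The nonterminals digits//1, digits_rest//1 and digit//1 are
hypothetical names made up for the example. The sketch assumes that the
definition of `/3 above is loaded and that backquote is a symbol
character, as in Quintus. A nonempty list terminal such as [X] should
not be mixed in, since the state is an @(Pos,Stream) term rather than a
list; an empty [] body is harmless because it only unifies the two
states.

```prolog
:- op(500, fx, `).   % same declaration as above, needed to read ` D

% digits(Ds): Ds is a nonempty list of decimal digit character codes.
digits([D|Ds]) --> digit(D), digits_rest(Ds).

% digits_rest(Ds): zero or more further digits, longest match first.
digits_rest([D|Ds]) --> digit(D), digits_rest(Ds).
digits_rest([]) --> [].

% digit(D): read one character code with ` and accept it only if it
% lies in the range 0'0..0'9. If the test fails, the next alternative
% starts again from the earlier state, and `/3 seeks back to it.
digit(D) --> ` D, { D >= 0'0, D =< 0'9 }.
```

Each failed digit//1 attempt costs one extra seek on the next read,
which is where the overhead of the technique comes from.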

If the grammar start symbol is S and the file to parse is File, define:

parse_from_file(S, File) :-
   open(File, read, Stream),
   stream_position(Stream, Pos0),
   phrase(S, @(Pos0,Stream), @(Pos, _)),
   % Failed alternatives may have read past Pos, so seek back to it.
   stream_position(Stream, _, Pos),
   % Optional: check that the whole file has been parsed.
   get0(Stream, C), is_end_of_file(C),
   close(Stream).
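
Note that the definition above leaves the stream open if the parse
fails, and that backtracking into phrase/3 after close/1 would read
from a closed stream. A minimal sketch of a variant that commits to the
first parse and always closes the stream follows. The names
parse_from_file_closing/2 and parse_stream/2 are hypothetical, the
end-of-file check is left out for brevity, and errors raised during
parsing are not handled.

```prolog
% parse_from_file_closing(S, File): hypothetical variant that closes
% the stream whether or not the parse succeeds. Only the first
% solution of the grammar is considered.
parse_from_file_closing(S, File) :-
   open(File, read, Stream),
   (  parse_stream(S, Stream) -> Parsed = true
   ;  Parsed = false
   ),
   close(Stream),
   Parsed == true.

% parse_stream(S, Stream): run start symbol S on Stream using the
% @(Position, Stream) state, leaving the stream at the final position.
parse_stream(S, Stream) :-
   stream_position(Stream, Pos0),
   phrase(S, @(Pos0,Stream), @(Pos,_)),
   stream_position(Stream, _, Pos).
```

Bindings made by the grammar (for example a parse tree argument of S)
survive the if-then-else, so the caller still sees the result.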

(I modified my actual code a bit for this posting. Caveat...)

The relative slowness is due to the constant fiddling with stream
positions, which involves going through several layers of Prolog, C,
I/O libraries and system calls. But using assert to implement some
form of streams would be even worse.
--
Fernando Pereira
2D-447, AT&T Bell Laboratories
600 Mountain Ave, PO Box 636
Murray Hill, NJ 07974-0636
pereira@research.att.com
