Newsgroups: comp.ai.nat-lang,comp.lang.prolog
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.mathworks.com!uunet!allegra!alice!pereira
From: pereira@radish.research.att.com (Fernando Pereira)
Subject: Re: code for finite automata
In-Reply-To: vannoord@let.rug.nl's message of Sun, 12 Feb 1995 21:36:50 GMT
X-Nntp-Posting-Host: radish.research.att.com
Message-ID: <PEREIRA.95Feb12213106@radish.research.att.com>
Sender: usenet@research.att.com (netnews <9149-80593> 0112740)
Reply-To: pereira@research.att.com
Organization: AT&T Bell Laboratories
References: <3hfq8c$9cf@lyra.csx.cam.ac.uk> <PEREIRA.95Feb10222452@radish.research.att.com>
	<ET.95Feb12104515@burns.cogsci.ed.ac.uk>
	<1995Feb12.213650.17023@let.rug.nl>
Date: Mon, 13 Feb 1995 02:31:06 GMT
Lines: 34
Xref: glinda.oz.cs.cmu.edu comp.ai.nat-lang:2887 comp.lang.prolog:12265

In article <1995Feb12.213650.17023@let.rug.nl> vannoord@let.rug.nl (Gertjan van Noord) writes:
   >In article <PEREIRA.95Feb10222452@radish.research.att.com> pereira@radish.research.att.com (Fernando Pereira) writes:
   >
   >> I wouldn't hold my breath for really efficient FSA intersection,
   >> determinization or minimization in Prolog. All those algorithms
   >> depend on imperative algorithms to achieve decent
   >> performance. Imperative algorithms in Prolog require using assert
   >> etc, which have awful constant multipliers. Either Prolog extensions
   >
   As an exercise (for myself..) I implemented a determinizator. Please
   bear in mind Fernando's comment in judging it. I concentrated on making
   the resulting program as fast as possible, rather than the conversion
   step.

Just looked at your code. The main efficiency problem is the check
for whether a deterministic state (state subset) already exists. Its
cost is linear in the sum of the cardinalities of all the states
already built, which is not too good for large result automata or for
deterministic states corresponding to large state subsets. In
contrast, with a suitable data structure the lookup can become
logarithmic in the number of states built or, using hashing, constant
on average. My prejudice here is that I tend to work with automata
with 10^5-10^6 states...
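
To make the hashing point concrete, here is a minimal sketch in Python
rather than Prolog, since a built-in hash table keeps it short. It is
not a rewrite of your code: the names determinize, nfa, index and delta
are made up for illustration, and it assumes an NFA without epsilon
moves given as a dictionary from (state, symbol) to a set of successor
states. Each state subset is stored as a frozenset key, so the "does
this subset already exist?" check is one hash lookup, costing about the
size of the subset being looked up instead of a scan over every subset
built so far.

```python
from collections import deque


def determinize(nfa, start, alphabet):
    """Subset construction for an NFA without epsilon moves.

    nfa maps (state, symbol) -> set of successor states (assumed format).
    Returns (index, delta): index maps each reachable frozenset of NFA
    states to a DFA state number, and delta maps
    (dfa_state, symbol) -> dfa_state.
    """
    start_set = frozenset([start])
    index = {start_set: 0}  # hash table: state subset -> DFA state number
    delta = {}
    queue = deque([start_set])
    while queue:
        subset = queue.popleft()
        src = index[subset]
        for sym in alphabet:
            target = frozenset(
                t for s in subset for t in nfa.get((s, sym), ())
            )
            if not target:
                continue  # no transition on this symbol
            # One hash lookup replaces the linear scan over all subsets.
            dst = index.get(target)
            if dst is None:
                dst = len(index)
                index[target] = dst
                queue.append(target)
            delta[(src, sym)] = dst
    return index, delta


# Toy NFA over {a, b} whose state 2 is reached after reading "...ab".
example_nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
example_index, example_delta = determinize(example_nfa, 0, "ab")
```

On the toy automaton this builds three deterministic states, for the
subsets {0}, {0,1} and {0,2}. A balanced tree keyed on a canonical
ordering of each subset would give the logarithmic variant instead.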




--
Fernando Pereira
2B-441, AT&T Bell Laboratories
600 Mountain Ave, PO Box 636
Murray Hill, NJ 07974-0636
pereira@research.att.com
1-908-582-3980


