Newsgroups: comp.ai.nat-lang
From: jamie@cs.sfu.ca (Jamie Andrews)
Subject: Re: parse unknown words
Message-ID: <1994Oct25.165507.8404@cs.sfu.ca>
Organization: Faculty of Applied Science, Simon Fraser University
References: <1994Oct21.175206.5625@seas.smu.edu>
Date: Tue, 25 Oct 1994 16:55:07 GMT

In article <1994Oct21.175206.5625@seas.smu.edu>,
Ted Pedersen <pedersen@seas.smu.edu> wrote:
>I'd like to find an implementation (in Prolog especially) of a parser
>that tries to parse sentences with unknown words. 
>
>I'm especially interested in parsers that use a phrase structure
>grammar and will attempt to parse sentences with unknown words as best
>they can using modifications to fairly standard parsing algorithms
>(Earley's, BUP, etc.)  

     If you have a Prolog parser, it should be fairly easy to
modify it to parse unknown words.  Thread a "dictionary addenda"
argument through every parsing predicate; it holds a list of terms
of the form Word=Part_of_speech (I'm simplifying, obviously).

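     Then, if a word is not in the standard dictionary, check
whether it's a member of the dictionary addenda (e.g. with
member/2 or memberchk/2).  If you pass an uninstantiated variable
as the initial addenda argument, the membership test adds an entry
for each unknown word, with whatever part of speech the grammar
needs at that point.  After a successful parse the variable is
bound to a list of all the unknown words, with its tail still
unbound.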
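
     A minimal sketch of the idea, assuming a toy DCG grammar.
The lexicon lex/2, the nonterminals s//1, np//1, vp//1, word//2,
the helper lookup/3 and the sample words are all made up for
illustration; a real parser would have its own names:

```prolog
% Hypothetical toy lexicon, standing in for the "standard dictionary".
lex(the, det).
lex(dog, noun).
lex(cat, noun).
lex(chased, verb).

% Toy phrase-structure grammar.  Add is the dictionary addenda,
% passed unchanged to every nonterminal.
s(Add)  --> np(Add), vp(Add).
np(Add) --> word(det, Add), word(noun, Add).
vp(Add) --> word(verb, Add), np(Add).

% Consume one word and check it against the category the grammar wants.
word(Cat, Add) --> [W], { lookup(W, Cat, Add) }.

% A known word must have category Cat in the lexicon.
lookup(W, Cat, _) :-
    lex(W, Cat).
% An unknown word is found in, or appended to, the addenda.
% memberchk/2 on a list with an unbound tail adds W=Cat at the end.
lookup(W, Cat, Add) :-
    \+ lex(W, _),
    memberchk(W=Cat, Add).
```

With these definitions, the query
phrase(s(Add), [the, wug, chased, the, cat]) succeeds and binds Add
to a partial list whose first element is wug=noun and whose tail is
still unbound.  I used memberchk/2 rather than member/2 so that a
failing parse does not keep extending the list on backtracking.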

     There are other issues, e.g. whether a new word may have
more than one entry in the addenda (say, both noun and verb),
but this basic framework should work.
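
     As one possible answer to the multiple-entries question,
here is a sketch of a lookup that allows at most one entry per
new word.  The name lookup_unique/3 is hypothetical, and whether
you want this policy at all depends on your grammar:

```prolog
% lookup_unique(+Word, ?Cat, ?Addenda)
% Allow at most one addenda entry per word.
lookup_unique(W, Cat, Add) :-
    memberchk(W=Known, Add),   % find the existing entry, or append a fresh one
    Known = Cat.               % fail if the word already has another category
```

After lookup_unique(wug, noun, A) has succeeded, a later call
lookup_unique(wug, verb, A) fails, so the parser has to backtrack
instead of quietly recording wug as both a noun and a verb.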

--Jamie.
  jamie@cs.sfu.ca
"Make sure Reality is not twisted after insertion"
