Newsgroups: comp.lang.lisp.mcl
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!gatech!swrinde!tank.news.pipex.net!pipex!peer-news.britain.eu.net!newsfeed.ed.ac.uk!edcogsci!cnews
From: chrisbr@cogsci.ed.ac.uk (Chris Brew)
Subject: Re: looking for part of speech tagger and graphics library
In-Reply-To: ien@mit.edu's message of 24 Feb 1996 13:01:02 -0500
X-Nntp-Posting-Host: galloway
Message-ID: <f3tohqj4vgr.fsf@galloway.cogsci.ed.ac.uk>
Sender: chrisbr@galloway.cogsci.ed.ac.uk
Organization: Centre for Cognitive Science, University of Edinburgh
References: <199602241758.MAA02514@toxicwaste.media.mit.edu>
Date: Wed, 28 Feb 1996 10:04:36 GMT
Lines: 66

In article <199602241758.MAA02514@toxicwaste.media.mit.edu> ien@mit.edu (Ien Cheng) writes:

> I'm looking for a part of speech tagger for MCL: something that
> given an English sentence outputs the parts of speech of all the
> words. Any recommendations on which packages are worth taking a look
> at would be helpful.

Caveat: are you sure your application can live with an error rate of
anything between 1 and 10% in part of speech assignment? If so, then
several solutions are available, with (IMHO) the best one for your
needs described below.
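
To put that caveat in perspective, here is a rough back-of-envelope
sketch of how per-word errors add up over a sentence. It assumes that
tagging errors are independent from word to word, which is a
simplification that real taggers do not guarantee, and the function
name is made up for illustration (it is not part of any tagger).

```lisp
;; Assumption: errors are independent across words.
;; WHOLE-SENTENCE-ACCURACY is a hypothetical helper for illustration.
(defun whole-sentence-accuracy (word-error-rate sentence-length)
  "Chance that all SENTENCE-LENGTH words get the right tag."
  (expt (- 1 word-error-rate) sentence-length))

(whole-sentence-accuracy 0.01 20)   ; roughly 0.82
(whole-sentence-accuracy 0.10 20)   ; roughly 0.12
```

So even at the good end of the range, about one 20-word sentence in
five would contain at least one wrong tag under this assumption.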

The best available Lisp technology for this is Xerox's part-of-speech
tagger, available from ftp://ftp.parc.xerox.com/pub/tagger (as I write
I am failing to connect to this URL, but that is normal across the
Atlantic). The package includes a lexicon, a statistical model
trained on half the Brown corpus, facilities for tokenization
and for guessing lexical entries for unknown words, and (not
least) a portable defsystem utility which makes getting it started
fairly easy. The current version is 1.2, and I know for a fact that
an earlier version (1.1) ran well with MCL 2.0.1. It shouldn't
be too hard to get 1.2 running with MCL 3. (Has anybody already
done this?)

If not, here's my guess about how to install.

You'll have to experiment with MCL to see what the
appropriate incantations are, but off the cuff I would guess
that

(load "pdefsys")
(pdefsys:compile-system :tdb-sysdcl)
(pdefsys:load-system :tdb-sysdcl)
(pdefsys:compile-system :tag-english :propagate t)
(pdefsys:load-system :tag-english)

will get you into a reasonable state. As I recall, the
documentation wants you to also load the system :cl-extensions. The
point of this system is to fix up some deviations from Common Lisp
typical of older implementations. Without trying it, my guess is
that you should start by assuming that MCL 3.0 already has all
the extensions that are needed.


> Ien Cheng

> --- MIT Media Lab Gesture and Narrative Language Group

I'd be interested to hear how you get on.

Chris

The Human Communication Research Centre offers a free helpdesk service
for anyone considering the use of natural language software in
practical applications. No warranty attaches to this advice, though
we do our best to make it useful.
-- 
------------------------------------------------------------------
Email: Chris.Brew@edinburgh.ac.uk
Address:  Language Technology Group, 
          HCRC, 2 Buccleuch Place,  Edinburgh EH8 9LW 
          Scotland
Telephone: +44 131 650 4631 
Fax:       +44 131 650 4587
------------------------------------------------------------------


