CMU Artificial Intelligence Repository

Brill: Trainable Part of Speech Tagger

This directory contains Eric Brill's trainable rule-based part-of-speech tagger. The tagger is based on transformation-based error-driven learning, a technique that has been effective in a number of natural language applications, including part-of-speech and word-sense tagging, prepositional phrase attachment, and syntactic parsing. The code includes a tokenizer for ASCII English, an English lexicon induced from the Brown corpus, a table mapping word suffixes to likely ambiguity classes, and an HMM trained on the odd-numbered sentences in the Brown corpus. For more information, see chapter 6 of Brill's thesis.
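As a rough illustration of transformation-based error-driven learning, the sketch below starts from an initial tagging, compares it with a hand-tagged reference, and greedily picks the rewrite rule that removes the most errors. It is written in Python for brevity and is not taken from the distributed code. The function names, the single rule template ("change tag A to B when the previous tag is C"), and the toy sentence are all assumptions made for this example.

```python
def apply_rule(tags, rule):
    """Return a copy of tags with one rule applied.

    A rule is (from_tag, to_tag, prev_tag): change from_tag to to_tag
    wherever the preceding tag is prev_tag. Contexts are read from the
    input tags, so changes made by this pass do not trigger each other.
    """
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def count_errors(tags, gold):
    """Number of positions where the tagging disagrees with the reference."""
    return sum(t != g for t, g in zip(tags, gold))


def learn_rules(initial, gold, max_rules=10):
    """Greedily learn an ordered rule list that reduces tagging errors."""
    current = list(initial)
    learned = []
    tagset = sorted(set(initial) | set(gold))
    for _ in range(max_rules):
        best_rule, best_tags = None, None
        best_err = count_errors(current, gold)
        # Try every instantiation of the (hypothetical) rule template.
        for from_tag in tagset:
            for to_tag in tagset:
                if from_tag == to_tag:
                    continue
                for prev_tag in tagset:
                    rule = (from_tag, to_tag, prev_tag)
                    candidate = apply_rule(current, rule)
                    err = count_errors(candidate, gold)
                    if err < best_err:
                        best_rule, best_tags, best_err = rule, candidate, err
        if best_rule is None:  # no rule improves the tagging; stop
            break
        learned.append(best_rule)
        current = best_tags
    return learned, current


# Toy data (hypothetical): "The race is expected to race tomorrow"
# Initial tags give each word its most frequent tag, so "race" is NN twice.
initial = ["DT", "NN", "VBZ", "VBN", "TO", "NN", "NN"]
gold    = ["DT", "NN", "VBZ", "VBN", "TO", "VB", "NN"]

learned, final = learn_rules(initial, gold)
print(learned)
print(final)
```

On this toy sentence the learner keeps a single rule, changing NN to VB after TO, which corrects the second "race". A real tagger would draw on many more rule templates and a full training corpus.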

Version: 1.13 (21-JUN-94)
Requires: Common Lisp
Copying: Copyright (c) 1993 by MIT. Use, copying, modification, and distribution permitted.
CD-ROM: Prime Time Freeware for AI, Issue 1-1
Mailing List: If you wish to be on the mailing list for future releases, bug reports, etc, please send mail to the author.
Author(s): Eric Brill
Keywords: Authors!Brill, Error-Driven Learning, HMM, Lisp!Code, Machine Learning, NLP, Parsing, Part of Speech Taggers, Taggers
References: ?
Last Web update on Mon Feb 13 10:27:02 1995