Newsgroups: comp.ai.nat-lang
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!udel!gatech!howland.reston.ans.net!pipex!uunet!bcstec!bronte!snake!rwojcik
From: rwojcik@atc.boeing.com (Richard Wojcik)
Subject: Re: best parser???
Message-ID: <1994Nov21.182044.28499@grace.rt.cs.boeing.com>
Sender: usenet@grace.rt.cs.boeing.com (For news)
Reply-To: rwojcik@atc.boeing.com
Organization: Research & Technology
References: <MAGERMAN.94Nov15175620@platypus.bbn.com>
Date: Mon, 21 Nov 1994 18:20:44 GMT
Lines: 42

In article <MAGERMAN.94Nov15175620@platypus.bbn.com>, magerman@bbn.com (David Magerman) writes:
>To answer your question briefly, accurate parsing of real text exists
>only our imagination.  Parsing accuracy is hard to measure, but
>suffice it to say that on a task like parsing WSJ, the best parsers in
>the world are below 50% sentence accuracy even by crude measures, and
>*FAR* below 50% on more stringent measures.  Also, the most accurate
>and detailed parsers are terribly slow.

What are the "measures" that you are referring to?  It is impossible to assign any
meaning to these percentages without some clear notion of what you mean by
"accuracy".  If you are looking for a parser to get exactly the intended parse
tree in every case, then I would say the 50% figure means that you are working
with a fairly restricted corpus (not WSJ) or fairly shallow parse trees.   How does
"accuracy" translate into precision and recall?

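To make the question concrete, here is a minimal sketch of two ways the same
parse could be scored.  It is only an illustration: the function name
bracket_scores, the (label, start, end) span encoding, and the example sentence
are my own assumptions, not the measures used in the article quoted above.

```python
def bracket_scores(gold, predicted):
    """Return (precision, recall, exact_match) for one sentence.

    Each parse is a set of (label, start, end) constituent spans,
    with word positions counted from 0 and `end` exclusive.
    """
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall, gold == predicted


# Hypothetical sentence: "the crew removed the panel with a wrench"
# Intended parse: the PP modifies the verb phrase.
gold = {
    ("NP", 0, 2), ("NP", 3, 5), ("NP", 6, 8),
    ("PP", 5, 8), ("VP", 2, 8), ("S", 0, 8),
}
# Parser output: identical, except the PP is attached to the object NP,
# which adds one spurious NP spanning "the panel with a wrench".
predicted = gold | {("NP", 3, 8)}

precision, recall, exact = bracket_scores(gold, predicted)
```

With this single attachment error the sentence counts as a complete failure on
an exact-match measure, yet bracket recall is 1.0 and bracket precision is 6/7.
The same parser output can therefore look very bad or quite good depending on
which of these is meant by "accuracy".
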
>But if someone tells you they have a parser which parses the WSJ with
>95% accuracy, don't buy it.

I don't think anyone is foolish enough to make such a claim, but parsing accuracy
turns out to be of little interest in the real world.  Ultimately, what we are 
interested in is precision and recall with respect to information extraction.  
I see accurate parse trees more as a side effect than a goal of language
processing.  That is why we looked at precision and recall for error critiques
in our ACL '93 paper.  Our (GPSG-based) parser-grammar makes errors in a high
percentage of the sentences that it processes, but most of those errors (e.g.
attachment errors) have little or no effect on the information that the system
is scanning for, namely compliance with a writing standard.  And, luckily for us,
our corpus (aircraft maintenance procedures) contains language that is largely
procedural in nature.  The accuracy of the system goes down somewhat when you
parse descriptive text, which tends to have longer, more complex sentences.

It is certainly true that the accuracy of information extraction depends on the
accuracy of the underlying parser/grammar to some extent, but the requirements
of the application do have an effect on what is most critical in terms of parsing
accuracy.  
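
As a rough sketch of the distinction, the following scores the same set of
sentences both ways.  Everything in it is hypothetical: the SentenceResult
record, the critique names, and the four data points are invented for
illustration and are not figures from our system or our paper.

```python
from dataclasses import dataclass, field


@dataclass
class SentenceResult:
    parse_is_correct: bool                       # exact match with intended tree
    expected: set = field(default_factory=set)   # critiques an editor would make
    reported: set = field(default_factory=set)   # critiques the system produced


def evaluate(results):
    """Return (parse_accuracy, critique_precision, critique_recall)."""
    if not results:
        raise ValueError("no sentences to evaluate")
    parse_accuracy = sum(r.parse_is_correct for r in results) / len(results)
    hits = sum(len(r.expected & r.reported) for r in results)
    n_reported = sum(len(r.reported) for r in results)
    n_expected = sum(len(r.expected) for r in results)
    precision = hits / n_reported if n_reported else 0.0
    recall = hits / n_expected if n_expected else 0.0
    return parse_accuracy, precision, recall


# Invented data: three of four parses are wrong, but only one of the
# parse errors changes what the checker reports.
results = [
    SentenceResult(True, {"passive"}, {"passive"}),
    SentenceResult(False, {"too-long"}, {"too-long"}),  # misattachment, critique still found
    SentenceResult(False, set(), set()),                # misattachment, nothing to report
    SentenceResult(False, {"passive"}, set()),          # parse error hides a real problem
]

parse_accuracy, precision, recall = evaluate(results)
```

On this made-up data, sentence-level parse accuracy is 0.25 while critique
precision is 1.0 and recall is 2/3.  Which parse errors matter depends entirely
on what the application is looking for.
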

---

Disclaimer:  Opinions expressed above are not those of my employer.

    Rick Wojcik   (rick.wojcik@boeing.com)   Seattle, WA

