Newsgroups: comp.ai.doc-analysis.misc,comp.ai.nat-lang
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!gatech!swrinde!ihnp4.ucsd.edu!munnari.OZ.AU!news.hawaii.edu!pollarda
From: pollarda@Hawaii.Edu (Art Pollard)
Subject: Re: Software Knows to Deconstruct in Plain English
X-Nntp-Posting-Host: uhunix2.its.hawaii.edu
Message-ID: <DnD25H.Hy8@news.hawaii.edu>
Sender: news@news.hawaii.edu
Organization: University of Hawaii
References: <4ftflg$st9@la1.digilink.net> <4gj1nt$pc2@la1.digilink.net> <4gjuh5$m2t@anarchy.io.com> <4glpld$ef6@la1.digilink.net>
Date: Mon, 26 Feb 1996 02:12:53 GMT
Lines: 148
Xref: glinda.oz.cs.cmu.edu comp.ai.doc-analysis.misc:153 comp.ai.nat-lang:4610

In article <4glpld$ef6@la1.digilink.net>,
Intelligent Text Processing <reynolds@itpinc.com> wrote:

[SNIP]

>Naturally, this caused quite a stir at the office...  Here is the
>reply from the technical staff:
>
>Metaknowledge 
>
>InQuizit does not have knowledge of its knowledge, nor has ITP claimed
>that it has such knowledge, so that InQuizit is not designed to answer
>Slocum's first three questions.  We only have texts about AIDS, not
>texts describing those texts.

I can understand this.  However, the question that was posed is a very
real one, the kind that would come up in everyday use.
People who use your system aren't going to say "What does this system know
or not and how might I phrase my question around its potential
limitations?"  They are simply going to use your system and ask it
questions.  And here on a very real question, your system failed. 

>Timing
>
>Slocum logged on to ITP's web site at the same time as a number of
>other users.  The site is running on a single Sparc, so it has limited
>cycles available.

So, what you are saying is that Slocum's query across 16 brochures took 
up so many resources on your Unix system that in a multiuser environment 
the system's response time is unacceptable.  I'm not sure that I want 
to commit my site (which also uses a Unix machine) to such technology -- 
we have many users most of whom want their questions answered yesterday.

BTW: Microsoft's web site (one of the most heavily used sites on the net)
is running on a Pentium.  Don't tell me that your Sparc station is too SLOW!

Also, if it is your site's only Sparc, as you claim, how do your _real_ 
customers try out your system?

>InQuizit's coverage of English
>
>InQuizit answers six of the questions which returned no information for
>Slocum.  Unfortunately the system had a bug when he dialed in.  This
>bug is now fixed and we invite users to try InQuizit again.

I read: "The system is so unstable that it is usable _only_ for research 
use.  If you want to use it to find real world solutions, you are in for 
a debugging session after which you may or may not receive the answer you 
are looking for."  Or, "Better off avoiding this question -- better blame 
it on a bug FAAAAAASSSSSTTTTT!!!!"

>Size of InQuizit's databases
>
>Some visitors have expressed concern that the demo is too small to
>evaluate the product.  The site is only intended as a demo, not as a
>statistically-based evaluation site.  ITP has made hundreds of
>documents queriable.  When customers get to the evaluation stage, they
>are given access to other libraries of texts.
>

16 brochures is not "too small"; it is TEENY TINY ITSY BITSY UNIMAGINABLY 
WEENCY!!!  Most of us who deal with IR are talking about hundreds of 
megabytes, if not gigabytes -- _much_ larger than 16 brochures on AIDS!  I can't 
believe that you went through all that trouble to put together a web site 
to demonstrate something which looks at less than 200K of material.  What 
a waste of time.

>Ratio of number of retrievals to database size
>
>It is true that InQuizit gets a large number of retrievals for
>relatively general questions about the transmission of AIDS in this
>library, because of the content of the library.  Most of the texts
>talk about AIDS transmission.  It is invalid to argue that there is a
>general over-retrieval problem.  There isn't.  InQuizit retrieves just
>those sections of text which are relevant to your query, with an 87%
>precision rate in one documented customer test.

People are going to ask general questions.  The very presence of FAQs on
USENET is proof of that.  If, say, I wanted to index the entire
USENET newsfeed for comp.compression, comp.lang.c or comp.lang.c++ using
your system, I would not want to retrieve 69% of the collection in
response to a general question.  (11 returned documents / 16 documents 
total == 68.75% of the entire collection.)

Well, what about your other customer tests, documented or not?  You can't 
just test something once and then claim that as your precision ratio.  
You need to run many tests with many different users and gather a large 
statistical sample and then with the proper (i.e., honest/correct) math, 
you will have a genuine precision ratio.  BTW: Do you let all your 
customers do your debugging and testing for you?
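
To be concrete about the math: precision for one query is the number of
relevant items retrieved divided by the number of items retrieved, and a
credible figure is an average over many queries along with some measure
of spread.  A minimal sketch in Python follows.  The function names and
the relevance judgments in it are made up for illustration; they are not
ITP's data or anyone's actual test results.

```python
import math


def precision(retrieved, relevant):
    """Fraction of retrieved items that are relevant (0.0 if nothing retrieved)."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)


def mean_precision(runs):
    """runs: list of (retrieved, relevant) pairs, one pair per query.

    Returns (mean precision, standard error of the mean).
    """
    scores = [precision(ret, rel) for ret, rel in runs]
    n = len(scores)
    mean = sum(scores) / n
    if n < 2:
        # One test gives no estimate of spread at all.
        return mean, float("nan")
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return mean, math.sqrt(var / n)


# Made-up judgments for three queries, purely for illustration.
runs = [
    ({"d1", "d2", "d3", "d4"}, {"d1", "d2"}),
    ({"d5", "d6"}, {"d5", "d6", "d7"}),
    ({"d8", "d9", "d10", "d11"}, {"d8"}),
]
mean, stderr = mean_precision(runs)
print(f"mean precision {mean:.3f} +/- {stderr:.3f} over {len(runs)} queries")
```

With a single test the standard error is undefined, which is exactly the
problem with quoting one number from one customer.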

>
>Large-Scale Testing
>
>Intelligent Text Processing is confident that InQuizit's performance
>would be excellent in a TREC or other competition, but participation
>in such competitions requires valuable resources which ITP needs to
>deploy elsewhere.

Well, if you don't want to play the game, you don't have to.  However, 
you will gain a much better reputation if you do.  If you don't want to 
wait for TREC, I can provide you with the ftp address for several other 
standard test collections that many of us have dealt with.  I can even 
supply you with the text of a number of books with which the world is 
familiar, such as the Bible or Moby Dick.  If you don't want to put 
forward the resources to market your product, you don't have to.  
However, I hope you are having fun, because your work at ITP certainly 
will not keep you fed for very long.

In the meantime, I challenge you to download a standard test collection 
and add it to your system.  It can't take more than an hour of your 
time.  If it takes longer then your system is too hard to use in a 
real world environment anyway.

As long as your claims stand, the following claim will stand: I am the 
world's best basketball player.  I can shoot from the free-throw line 
on the opposite side of the court and make a basket 87% of the time.  I 
can run from one side of the court to the other in 3.5 seconds flat -- no 
matter how many people get in my way.  (I simply run through them.)  
However, I will not play you or any other person because it takes too 
much time.  However, I will sell stock in my abilities for the day when I 
get the retrieval engine I am working on finished and I decide to take a 
break from programming and go play for the NBA for a while. 

From what I have seen so far, I suspect I could slap a system together 
which:

1) Is faster than your system.
2) Has equal or greater precision.
3) Supports natural language queries as well as yours.

How?  Simple: with Lex to remove stop words, Grep to do the search, and 
a shell script to tie the two together.  How would it work?  
Well, that should be obvious enough that from my description you 
should be able to put one together with the same performance measures.
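
For anyone who wants it spelled out, here is a rough sketch of the idea.
It is written in Python rather than Lex, Grep and a shell script, so it
is a stand-in for that pipeline, not the pipeline itself.  The stop-word
list, the function names and the two sample documents are assumptions
invented for the example.

```python
import re

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {
    "a", "an", "and", "are", "can", "do", "does", "how", "i", "in",
    "is", "of", "the", "to", "what", "you",
}


def keywords(query):
    """Lowercase the query, split it into words, drop stop words (the Lex step)."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return [w for w in words if w not in STOP_WORDS]


def search(query, documents):
    """Return names of documents containing every keyword as a whole word
    (the Grep step).  documents maps a name to its text."""
    terms = keywords(query)
    if not terms:
        return []
    hits = []
    for name, text in documents.items():
        lowered = text.lower()
        if all(re.search(r"\b" + re.escape(t) + r"\b", lowered) for t in terms):
            hits.append(name)
    return hits


# Made-up documents for illustration.
docs = {
    "whale.txt": "The white whale is hunted by Captain Ahab.",
    "ship.txt": "The ship sails from Nantucket.",
}
print(search("What is the whale?", docs))
```

A "natural language" question goes in, the stop words fall out, and what
is left is a plain keyword match.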

I hope to see your system run on some real world tests, with real world 
results, as every other company in IR has provided for its customers on the 
'net.  In the meantime, my challenge still stands for you to download a 
standard test collection (for free) and let the world see if you are 
telling the truth or not.

-Art




