Newsgroups: comp.ai.philosophy
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!howland.reston.ans.net!gatech!newsxfer.itd.umich.edu!zip.eecs.umich.edu!newshost.marcam.com!insosf1.infonet.net!internet.spss.com!markrose
From: markrose@spss.com (Mark Rosenfelder)
Subject: Re: Turing test (was Penrose and Searle)
Message-ID: <D05IFt.CwK@spss.com>
Sender: news@spss.com
Organization: SPSS Inc
References: <38tqh6$5qk@percy.cs.bham.ac.uk> <3bfphr$6sj@news-rocq.inria.fr> <D03Lp9.L4H@gpu.utcc.utoronto.ca> <hubey.786306282@pegasus.montclair.edu>
Date: Thu, 1 Dec 1994 21:31:03 GMT
Lines: 54

In article <hubey.786306282@pegasus.montclair.edu>,
H. M. Hubey <hubey@pegasus.montclair.edu> wrote:
>Maybe those who are anti-TT [in a manner of speaking] are trying to
>question the relativity of TT. I don't see much of a problem here
>that statistical testing could not solve.
>
>for example, in statistical testing of a hypothesis you can have
>four outcomes [could apply to humans too, stretching the idea].
>
>11	We call it True(1) when it is in fact True(1)
>10	We call it True when it is False  (Error)
>01	We call it False when it is True  (Error)
>00	We call it False when it is False
>
>The errors are called Type I and II errors in statistical hypothesis
>testing. The other two are correct. When this kind of possibility is
>applied to the TT (or humans passing judgements about each other)
>then it's possible that some people could erroneously judge the
>computer to be human when it isn't or judge a human to be a computer.
>
>But that is exactly the strength of the Turing test. It's a statistical
>test. If a machine can pass for human, then how could we say it isn't
>intelligent? The loopholes have been plugged by the TT.

I hope you know more statistics than this.  The fact that a given test cannot
distinguish between machines and humans does not prove that there is no
difference to find.  It may be that the test is lousy.
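
One rough way to put a number on "lousy": tabulate the judges' verdicts
against the truth and compute Youden's J (sensitivity + specificity - 1).
A J near 0 means the verdicts carry almost no information about which kind
of entrant the judge was talking to.  This is only a sketch; the function
name and all the counts below are invented for illustration and are not
data from any actual contest.

```python
def youden_j(human_called_human, human_called_machine,
             machine_called_human, machine_called_machine):
    """Youden's J for a judge deciding 'human' vs 'machine'.

    Arguments are counts from a 2x2 table of verdicts against truth.
    """
    # Sensitivity: fraction of real humans the judge labels human.
    sensitivity = human_called_human / (human_called_human + human_called_machine)
    # Specificity: fraction of machines the judge labels machine.
    specificity = machine_called_machine / (machine_called_human + machine_called_machine)
    return sensitivity + specificity - 1.0


# Hypothetical counts, invented purely for illustration.
sharp_judge = youden_j(45, 5, 5, 45)
lousy_judge = youden_j(30, 20, 29, 21)

print(f"sharp judge J = {sharp_judge:.2f}")
print(f"lousy judge J = {lousy_judge:.2f}")
```

With the second set of counts the judge labels machines "human" almost as
often as real humans, so a machine "passing" tells us very little, whether
or not a real difference exists.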

There are statistical tests to show whether or not a measuring system is
capable of making the discriminations asked of it.  (I helped write a
product here that makes such tests.)  Any measurement process will have
some fuzz-- some randomness in its results.  If the spread of this fuzz
is large enough in comparison with the thing measured, the measurement
process is useless.  
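
As a minimal sketch of the idea (not the method used in the product
mentioned above), one can measure each item several times and compare the
disagreement among repeated readings of the same item with the spread
between items.  The function name, the equal-repeats assumption and the
sample readings are all hypothetical.

```python
import math
import statistics


def fuzz_share(repeated_readings):
    """Fraction of total spread (as a standard deviation) due to measurement fuzz.

    repeated_readings: one list of repeated readings per item measured.
    Assumes every item has the same number of readings (at least two).
    """
    n_repeats = len(repeated_readings[0])
    # Pooled within-item variance: how much readings of the SAME item disagree.
    fuzz_var = statistics.mean(statistics.variance(r) for r in repeated_readings)
    # Variance of the item means, less the part of it that is itself fuzz.
    item_means = [statistics.mean(r) for r in repeated_readings]
    item_var = max(0.0, statistics.variance(item_means) - fuzz_var / n_repeats)
    total_var = fuzz_var + item_var
    if total_var == 0:
        return 0.0  # No spread at all; nothing to attribute to fuzz.
    return math.sqrt(fuzz_var / total_var)


# Hypothetical readings: three items, each measured twice.
steady = [[10.0, 10.1], [20.0, 20.1], [30.0, 30.1]]
noisy = [[10, 20], [12, 18], [19, 11]]

print(f"steady process: fuzz share = {fuzz_share(steady):.3f}")
print(f"noisy process:  fuzz share = {fuzz_share(noisy):.3f}")
```

A share close to 1 means nearly all of the observed variation comes from
the measurement process itself, which is the "useless" case described
above.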

We don't know if the measurement errors in the Turing Test are enough to
swamp the results-- and that's a problem right there.  But the known false
positives-- the people who judged ELIZA and the Loebner entries as human--
suggest that the fuzz is indeed very large.

>So far no one
>has even come up with a better alternative. 

Has anyone tried?

>What would be really
>interesting is if someone wrote a program that takes those standardized
>IQ tests that psychologists seem to be fond of giving to humans :-)..

Programs designed to read and solve algebraic word problems were written
back in the '60s (Bobrow's STUDENT, 1964, for one).

As a generalized test of intelligence, IQ tests are even worse than the 
Turing Test; the notion of "intelligence" they cover is ludicrously narrow.
