From newshub.ccs.yorku.ca!ists!torn!utcsri!rpi!usc!cs.utexas.edu!uunet!psinntp!dg-rtp!sheol!throopw Tue Jun 23 13:21:34 EDT 1992
Article 6348 of comp.ai.philosophy:
Path: newshub.ccs.yorku.ca!ists!torn!utcsri!rpi!usc!cs.utexas.edu!uunet!psinntp!dg-rtp!sheol!throopw
From: throopw@sheol.UUCP (Wayne Throop)
Newsgroups: comp.ai.philosophy
Subject: Re: 5-step program to AI
Summary: more chess stuff...
Message-ID: <5039@sheol.UUCP>
Date: 23 Jun 92 02:10:51 GMT
References: <1992Jun21.172732.10775@mp.cs.niu.edu>
Lines: 202

> rickert@mp.cs.niu.edu (Neil Rickert)
>>But look at what was said here. "Recognizing them when they occur again".
>>They have to *occur* in some sense before they can be *recognized*.
> Absolutely.  Agreement on one point.  (Regrettably that may be our only
> point of agreement).

I have the feeling that we're both (at least mostly) patiently trying to
get across the very same idea that the other already holds (but in
different words).  I may be wrong in this feeling, of course. 

> When a mathematician comes up with a new result, [...]
> the mathematics comes from something he has recognized, perhaps
> an unanticipated association between two different mathematical objects.
> His theorem documents the pattern he has recognized, and demonstrates that
> it is real and not merely imagined.  

We've just agreed above that when we say "recognized", we are noticing
that a concrete thing is an instance of an abstract thing.  That is,
there is some abstract "thing" involved, presumably the "something [the
mathematician] has recognized".  But neither the theorem nor the proof
were known to the mathematician at the time of this "recognition", so they
can't really be instances of the "something". 

This is why I think it is premature to call what is going on
"recognition".  It is premature to the extent to which the nature of the
"something" is unknown. 

Further, I see no way that the "proof" is a documentation of a pattern. 
The proof may indeed be an instance of some proof schema (and hence
provide an instance of a pattern), or the theorem may be an instance of
some class of theorems (and hence also be an instance of a pattern). 
But neither of these seem to have anything to do with the "something"
that was recognized, so I don't see how the proof can document the
something. 

Or put it this way.  I'm apparently not understanding what Neil is
saying here very well at all.

>>It's like looking at mug shots.  You have a mug-shot book of a zillion
>>felons.  A human *could* (given time) look at all possible felons and
>>recognize the miscreant.  That's how computers look up fingerprints.
> Let me repeat.  I am *not* referring to well defined patterns that can
> be checked in a point by point comparison.  

If recognition of human faces is thought to be "a well defined pattern
that can be checked in a point by point comparison", I think a drastic
underestimation of the difficulty and subtlety of the task may be
involved.  Especially when one gets into esoteric cases of recognizing
the adult, knowing only the child. 

But the real point here, I take it, is that Neil isn't talking about
the comparison of two concrete things, but rather the discovery that
a concrete thing is an instance of an abstract thing.

> I refer to that [..comparing two concrete things..] as "pattern
> matching".  The computer check of fingerprints is pattern matching.  

Well, I must now agree that fingerprint matching might be a bad example of
what I was intending, in a couple of distinct ways.  Let me abandon it. 
Think of a regular expression and a series of text strings.  A regular
expression is an abstract thing (well, technically, the instance of a
regular expression is a concrete thing at the level of "strings that
make up the regular expression language", but it *refers* to an abstract
thing...).  The strings are concrete things.  Recognizing a string as an
instance of the abstract regular expression is called "pattern matching"
or "pattern recognition" in the computer biz, more or less
interchangeably. 
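(To make that concrete, here is a minimal sketch in Python; the
language choice and the particular pattern are mine, purely for
illustration.)

```python
import re

# An abstract pattern: "one or more digits, a dash, one or more digits".
# The compiled regular expression *refers* to an abstract class of strings.
pattern = re.compile(r"^\d+-\d+$")

# Concrete strings; noting that a string is an instance of the abstract
# pattern is what the computer biz calls "pattern matching".
matches = [s for s in ["23-451", "hello", "1992-6"] if pattern.match(s)]
```
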

> Let me give an example.  Suppose I want to write a spell checking program.
> One way of doing it would be to use pattern matching.  I could read in the
> word, then compare it with 100,000 words of a dictionary to see if it
> matched.  This is somewhat expensive, but if binary search methods are used
> the cost can be tolerable.

But to me, that isn't *pattern* matching.  That's just *matching* (or
comparing).  (Of course, in the extreme case, the two blur into each
other, but I'd still say that comparing two concrete things is
*matching*, and noting an "is-a" or instance relation between a concrete
and an abstract is *pattern* matching.)
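(Neil's binary-search spell checker, in this terminology, compares the
concrete input word against concrete dictionary entries.  A hypothetical
sketch, with a toy word list standing in for the 100,000-word dictionary:)

```python
import bisect

# A sorted list of concrete strings stands in for the dictionary.
dictionary = sorted(["apple", "banana", "cherry", "chess", "pattern"])

def is_spelled_correctly(word):
    # Binary search: O(log n) comparisons, each against one concrete
    # entry -- "matching" rather than "*pattern* matching".
    i = bisect.bisect_left(dictionary, word)
    return i < len(dictionary) and dictionary[i] == word
```
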

> OR  I can create a finite state machine which accepts just the words of
> the dictionary.  As I read in the word, I run it through the finite state
> machine.  If the word is 7 letters long, it takes only 7 steps to check the
> spelling.  I recognize it, but I am not doing any pattern matching.

OK, THIS I'd call pattern matching, or pattern recognition.  The FSM is
an abstract representation of the class of sentences it recognizes, and
the word being checked is a concrete character string.  (It is worth
reinforcing the above point that "matching" and "pattern matching" blur
into each other here.  As we enhance the matching case with binary
searches, tree searches, hash searches, and other exotic encoding
schemes, and as we degrade the FSM by making it nondeterministic, or
representing classes of states with substrings, or whatnot, there comes
a time when the enhanced matching is the same thing as the degraded
pattern matching (or recognition)). 
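(Neil's FSM can be sketched as a trie of states, one transition per
letter, so that checking a 7-letter word really does take 7 steps.
This representation is my own illustration, not necessarily what Neil
has in mind:)

```python
def build_fsm(words):
    # Each state is a dict mapping the next letter to the following
    # state; the special key "" marks an accepting state.
    root = {}
    for word in words:
        state = root
        for letter in word:
            state = state.setdefault(letter, {})
        state[""] = True
    return root

def accepts(fsm, word):
    state = fsm
    for letter in word:          # one step per letter of the input
        if letter not in state:
            return False
        state = state[letter]
    return "" in state
```
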

But anyway, I may be making some progress understanding now.  I'll read
"pattern matching" as "comparing two concrete things", and "pattern
recognition" as "comparing an abstract thing to a concrete thing", and
see how far I get. 

> [...] we too readily think of pattern matching as if it were the
> only way to do recognition.  This leads to many fallacious ideas.  It
> leads to people counting how many steps the pattern matching would take
> in a computer, multiplying this by the reaction time of neurons, and
> attempting to deduce the degree of parallelism in the brain.  But it is
> all based on a probably false model.  

Presumably, this is meant to represent the argument I made earlier about
the brain not having the time or space to do pattern recognition.  But
note, I wasn't counting comparison (or match) steps in a pattern
recognition, I was counting *how* *many* *pattern* *recognitions* must
occur, and counting a single pattern recognition as unit cost. 

Again, pattern recognition is comparing an abstract to a concrete.
If the "goodness" of a chess position is a recognition of a pattern
*in* that chess position, then one still has to perform a recognition
step on *each* possible chess position generated in some way.  Too
much time required.
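(The arithmetic behind "too much time": chess has a branching factor
commonly cited as roughly 35 legal moves per position, so even at unit
cost per recognition the count of recognitions explodes with depth:)

```python
# Rough cost estimate: with ~35 moves per position, the number of
# positions -- each needing one unit-cost recognition -- grows as 35^d.
branching = 35
costs = [branching ** depth for depth in range(1, 6)]
```

Five plies ahead already means tens of millions of recognitions.
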

My second guesstimate was based on the notion that perhaps the pattern
recognition was a pattern in the "reachableness" of chess positions as a
whole, as well as in the "goodness".  Here, the concrete object being
matched (that is, the tree of possible future chess moves) is too large
to fit as an input to a "neural net" type pattern recognizer. 

Now.  At this point, I'm perfectly willing to suppose that a recognition
IS what's going on.  But I'm still in the dark as to what Neil thinks
the abstract thing and the concrete thing *are* that result in a new
move.  I imagine it is something like "in games with positions like this
one, I've won in the past by making a move like so", so the match is on
the current position, and the abstract thing is somehow tied to the
"right move". 

Possible.  That would be the "I only look one move ahead: the right
move" model.  What I was trying to get at is that there is some
indication that that's not how it works. 

One indication is that early computer chess players tried essentially
this.  That is, what was called the "heuristic" approach.  Various
patterns in board positions were abstracted, and the "right move"
attached to each abstraction.  Then play proceeded by recognition on
board position yielding a move.  They were abandoned when they didn't
keep up with search-driven methods.  This is a weak indication, of
course, since it might just be that human pattern recognition of
position-to-move is much more subtle, or that computer speeds improved
faster than programmer cleverness.  (Obviously, "expensive evaluate,
little depth" and "cheap evaluate, great depth" are ends of a spectrum
and can blur into each other.  So-called "hybrid techniques" were also
tried, and all manner of complications abound, but everything here is
being oversimplified so I won't feel too ashamed.)

Another indication that this isn't what's going on is that I still think
there's a deep connection between computer play and mathematics. 
Mathematicians don't just do a pattern match on the current state of the
art, and then propose a single proof step, and repeat.  They have much
deeper goals, and skip with strides of tens, hundreds, and sometimes
thousands of proof steps to yield interesting conjectures. 

But maybe there's not a deep connection there.  Or maybe the
"skipping ahead" is just an illusion.  Or something.

Who knows.  As I said, "I'm only an egg".

> If you had a computer which could do recognitions, but held no stored
> images, you would have to program it to simulate positions and attempt
> to recognize them.  You would find that you were programming it in much
> the same "wrong way" approach.

No, I don't think so.  But maybe we are "agreeing in different words"
here.  Let me try again.

The computer has an abstract thing "in mind".  A pattern of what
constitutes "goodness" in a chess position, encoded as (say) an FSA, or
neural net, or whatnot.  It then generates zillions of concrete things,
namely reachable chess positions.  It then prunes away those positions
that lack "goodness".  It is left with positions, sometimes "far" in the
future of the game, that are "good" positions.  It then chooses the
move that these positions share as a "first move to get there".

That's the "front way" or "obvious way" I was talking about.  The
computer has no "stored images", no "matching" or "comparison" is going
on.  Rather, it has a "recognizer" of cost "1" on chess positions.  But
*reachable* chess positions it must labor at. 
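(That "front way" can be sketched in a few lines of Python.  The
callbacks `goodness`, `moves`, and `apply_move` are hypothetical
stand-ins for a real engine's evaluator and move generator; the sketch
filters for "goodness" only at the horizon rather than pruning at every
step, which is a simplification:)

```python
def best_move(position, depth, goodness, moves, apply_move):
    # Generate the reachable positions to a fixed depth, keep the "good"
    # ones, and pick the first move that leads to the most of them.
    def good_leaves(pos, d):
        if d == 0:
            return [pos] if goodness(pos) else []
        leaves = []
        for m in moves(pos):
            leaves.extend(good_leaves(apply_move(pos, m), d - 1))
        return leaves

    scored = []
    for m in moves(position):
        survivors = good_leaves(apply_move(position, m), depth - 1)
        scored.append((len(survivors), m))
    return max(scored)[1] if scored else None
```

Note that the recognizer (`goodness`) is unit cost per call; the labor
is all in enumerating the *reachable* positions.
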

The human, so-called "wrong" way around arrives at the set of positions
far "too soon", and hence I think there is some magic there ("magic"
being "as-yet-not-understood mechanism").  It may be that humans just
start with a more complicated "abstract thing" that subsumes both
"goodness" and "reachability", and pattern match on that.  I argue that
"reachability" is too big to fit, unless there are currently unknown
regularities or topography to it that allow it to be "folded up" or
"summarized".  IF reachability can be "folded" or "summarized", then
this would be compatible with Neil's position, I think. 

My position, then, is that I currently think some unknown method is
being used to "skip" the pattern recognition over reachable chess
positions, allowing humans to "start at the end" and work backwards. 
This method may turn out to be a case of pattern recognition over
something other than chess positions.  Maybe. 
--
Wayne Throop  ...!mcnc!dg-rtp!sheol!throopw
