Newsgroups: comp.ai.philosophy
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!hookup!swrinde!howland.reston.ans.net!news.cac.psu.edu!news.pop.psu.edu!hudson.lm.com!newsfeed.pitt.edu!uunet!psinntp!scylla!daryl
From: daryl@oracorp.com (Daryl McCullough)
Subject: Re: What's Grammar, Anyway?
Message-ID: <1995Feb23.142706.3877@oracorp.com>
Organization: Odyssey Research Associates, Inc.
Date: Thu, 23 Feb 1995 14:27:06 GMT
Lines: 160

adjih@photon.remcomp.fr (Cedric Adjih) writes:

>  The primordial purpose of a grammar is the efficient transmission of 
>information.

I agree. In my article, I broke the aspects of efficient transmission
into 

     (1) Having structure (redundancy) that allows us to deduce the
         meaning of an utterance even in the presence of noise or unknown
         words. This is what I mean by "error-correction". I know that
         in my original article I only talked about the need to
         recover from noise, but the same sorts of error-correcting
         mechanisms in a language also allow a listener to understand
         the meaning of a sentence even when some of the words are
         unfamiliar. 

     (2) Having structure that supports a speaker in the creation of
         utterances expressing an infinitely rich variety of meanings.

>To understand this, try to imagine how a language developed.
>If you were a Cro-Magnon, there might be only a relatively small number
>of things you would ever say: "Mheef" (I'm hungry), "Mhaaf" (I see 
>comestible animals coming in), "Mhiif" (let's go west), "Mhuuf" (Hell,
>where is Oog playing again?). Why not just have one word for each 
>different thought? Who needs sentences anyway? Well, there are problems. 
>There is always others' (or your own) stupidity that makes them not
>understand that "Mhuuf" means "Hell, where is Oog playing again?".

This is a good point. But as I said above, "error-correction" covers
both understanding in the presence of noise and understanding in the
presence of unknown words.

>And to remind
>them of the meaning of "Mhuuf", you make signs:
>  Since it happens that Oog has a blue gem around his neck, you show
>the (only) other blue gem there is. Then, since Oog spends his playing 
>time turning cartwheels, you do so. And finally, to show that you want to
>know "where", you mimic someone looking for something, turning your head
>right to left, left to right, watching intently...
>
>  If each word expresses a completely different thought, and you don't
>understand the word, then you will have to mime the thought. This
>is painful, because when you want to talk about Oog, you may have to 
>show a blue gem, and possibly go get it when it is 300 meters away.
>  Then you get an idea: you bring the tribe together and alternately
>show the blue gem and say "Mhoof". Now each time you want to ask 
>"Mhuuf", you can start the mimicry by saying "Mhoof" instead of
>showing the blue gem. If you are lazy or too old, you can even define 
>new terms for "turning cartwheels" (Zaaf) and "looking for" (Zeef).
>Thus when, in response to "Mhuuf", you get "I can't understand a word
>of what you are trying to say", you can still ask "Mhoof Zaaf Zeef".
>
>  Shortly after that, you'll probably give up the use of "Mhuuf", 
>since it is too complex a thought for someone to remember.
>
>  And after some time, the tribe would asymptotically end up with a 
>language whose words describe _simple_ and _easy-to-show_
>entities. The tribe would also gain a naming scheme for its members,
>by analogy or association (Oog="Mhoof"; <Little Flower>, <Witty Bison>,
><Fat Ape>, ...)

I agree completely. When I said "the transmission of information in
the presence of noise", that was too narrow. Imperfect memory and
imperfect knowledge of the vocabulary are two impediments to
communication that must be compensated for as much as noise is.

>  IMHO, the theory "emergence of grammar to allow error-correction" 
>has some flaws:
>
> - first, it is overkill, because if error-correction is so important, you
>can still always repeat your phrase twice (or more): it doesn't 
>require as much intellectual effort from you as the use of a grammar does.

That is certainly a strategy, but there are flaws with repetition as
an error-correction code. One problem is that it doubles the amount of
time it takes to say everything. The second problem is that repetition
is of no help whatsoever with an unknown word. The structure that
grammar provides allows us to get the basic meaning of a sentence even
when several parts are lost due to either noise or lack of familiarity.
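The contrast can be made concrete with a toy sketch (an illustration
only, not a model of real speech): a repetition code recovers garbled
characters only by multiplying airtime, and it is of no use at all when
the word itself is unknown to the listener. All names here ("Mhoof",
the noise rate) are made up for the example.

```python
import random

def transmit(word, p_noise, rng):
    """Replace each character with '?' (an erasure) with probability p_noise."""
    return "".join("?" if rng.random() < p_noise else c for c in word)

def repetition_decode(copies):
    """Recover each character position from the first copy that got it through."""
    decoded = []
    for chars in zip(*copies):
        survivors = [c for c in chars if c != "?"]
        decoded.append(survivors[0] if survivors else "?")
    return "".join(decoded)

rng = random.Random(1995)
word = "mhoof"
copies = [transmit(word, 0.3, rng) for _ in range(3)]
print(copies, "->", repetition_decode(copies))  # 3x the airtime to fix noise
```

Note that if the listener has never heard "mhoof", three clean copies
leave him no better off; grammatical structure, by constraining what can
appear where, does help with that case.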

> - second, if error-correction were the major factor in language evolution,
>I would expect _individual_ words to be as unambiguous as possible,
>i.e. to be long. The reverse is true: frequently used words are short
>(which strongly increases the probability of error): in French, English,
>and German, the pronouns "I", "you", ...; the verbs "to be" and "to have" 
>are one or two syllables long, just like most frequent words.

I don't see how the error-correction idea conflicts with the idea that
frequently used words have short expressions. Because they are so
frequently used, people know precisely when to expect such words to
appear, and so it takes very little signal to unambiguously identify
them. Suppose that you hear someone singing very softly from far away.
If it is an unfamiliar song, the chances are good that you won't be
able to make out the individual words. On the other hand, if it is
a song that you know very well, then you will find that it is not
very difficult to follow the singer. You can easily figure out which
verse she is singing. Error correction is less necessary with familiar
words.

But I agree with you that it is efficiency (of both transmission of
information and correcting for missing information) that accounts for
how redundant words are. According to Shannon's analysis, the length
of a word in a perfectly efficient language would, in the absence of
noise, be proportional to log(1/P), where P is the probability of the
word appearing. So viewing language as a code in Shannon's sense
accounts for why common words are short.
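The relationship is easy to check numerically. The sketch below uses
hypothetical word frequencies (chosen purely for illustration) and
computes Shannon's ideal code length, log2(1/P) bits, for each:

```python
import math

# Hypothetical relative frequencies, chosen only for illustration.
freqs = {"the": 0.07, "be": 0.04, "boar": 0.0005, "quantum": 0.00001}

for word, p in freqs.items():
    ideal_bits = math.log2(1 / p)  # Shannon's ideal code length, in bits
    print(f"{word:8s} P={p:<8g} ideal length = {ideal_bits:.1f} bits")
```

A word a thousand times rarer earns only about ten extra bits
(log2(1000) is roughly 10), which matches the observation that common
words are very short while rare words are merely somewhat longer.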

> - third, I think that if one pronounces individual words, the average
>human would still understand most of them (with no grammar).

There are layers of structure in a language. Individual words have
structure in terms of patterns of consonants, vowels, stress, etc.
This "grammar" of individual words is what makes a word sound
recognizably French or English even when the word is unfamiliar to
the listener.

>It seems easier to lessen the number of vowels or phonemes if there is
>too much error. (Why aren't there 100 different vowels? IMHO because
>of errors/ambiguities.)

I agree. It is more efficient and less error-prone to build up words
out of a small number of phonemes, and build up concepts out of a
(relatively) small number of words than it would be to have separate
sounds for each concept.
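The combinatorial advantage is easy to quantify (the inventory size
below is illustrative; real languages have on the order of tens of
phonemes): the number of distinguishable words grows exponentially with
word length, whereas one-sound-per-concept needs a new, acoustically
distinct sound for every single concept.

```python
# An illustrative phoneme inventory size; real languages range roughly 10-100.
phonemes = 40

for length in range(1, 5):
    print(f"distinct words of length {length}: {phonemes ** length:,}")
```

Even three- and four-phoneme words already give tens of thousands to
millions of distinct forms from a sound inventory small enough to keep
the individual sounds well separated.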

>  Well, what I think is essentially the reverse: the infinite variation
>in the content has favored the emergence of grammatical constraints that
>allow communication to be efficient (in terms of length of sentences and
>length of learning). The error-correctiveness is a lucky side-effect.

Well, here we disagree slightly. I think that there is a feedback
effect, rather than strict cause-and-effect. Having a complex content
makes a grammar of a certain complexity necessary, but *also* having a
grammar allows the expansion of the content. As far as I know, most of
the basic grammatical features of modern language were pretty much in
place thousands of years ago, although what people have to talk about
is very, very different today. Using the same noun/verb/direct-object
structure, we can talk about quantum mechanics as well as hunting wild
boars. There was no evolutionary need for a grammar that was capable
of talking about quantum mechanics, but the grammar is so very
flexible that it can be used to talk about *anything* (actually, maybe
there are some things that language doesn't allow us to talk about,
but we don't talk about those things.)

I don't agree that error-correction is simply a lucky
side-effect. Noise in communication (interference from other sound
sources, variation in pronunciation from person to person, and
variation from person to person in the vocabulary used) is always with
us. That is the reason that automatic speech recognition systems often
fail so miserably even when a human listener can understand perfectly
well what is said. Of course, humans use both grammatical and semantic
knowledge to correct for noise, and so it is an overstatement to say
that grammar should alone get the credit.

Daryl McCullough
ORA Corp.
Ithaca, NY


