Newsgroups: comp.ai,comp.ai.philosophy,comp.ai.alife
Path: cantaloupe.srv.cs.cmu.edu!rochester!cornellcs!newsstand.cit.cornell.edu!portc01.blue.aol.com!newsxfer2.itd.umich.edu!howland.erols.net!netcom.com!jqb
From: jqb@netcom.com (Jim Balter)
Subject: Re: rand() - implementation ideas [Q]
Message-ID: <jqbE0Dvyr.A9H@netcom.com>
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
References: <54lr8o$ndm@nntp.seflin.lib.fl.us> <3279fbac.0@news.iea.net> <jqbE080Gu.HrM@netcom.com> <327e9a12.0@news.iea.net>
Date: Tue, 5 Nov 1996 06:13:38 GMT
Lines: 98
Sender: jqb@netcom23.netcom.com
Xref: glinda.oz.cs.cmu.edu comp.ai:41884 comp.ai.philosophy:48314 comp.ai.alife:6827

In article <327e9a12.0@news.iea.net>,
Steve McGrew <stevem@comtch.iea.com> wrote:
>        You misunderstood.  I am not looking for an argument; I am hoping for 
>an explanation so I'll understand something I do not understand yet-- and 
>there are many such things!

That's good, because there is no argument to be had.  It's a theorem, not my
opinion; there's nothing you can do about it.  You were *given* explanations,
*and* references.  But if you don't want to be seen as merely arguing, then
stop ignoring what has already been said.

>        Suppose you send me a long sequence of numbers you tell me has been 
>generated at random.  Suppose I find in it a string of 100 contiguous 1's.  I 
>should be able to re-send that sequence in somewhat compressed form, as long 
>as you know my compression algorithm in advance.  It won't happen very often-- 
>but sometimes it will.  And on those rare occasions you can get a little bit 
>of compression.  

It has already been explained at least twice that any such encoding creates an
ambiguity; an algorithm that encodes 1111 as 13 must also encode 13 *somehow*;
*that* encoding will either be an *expansion* of 13, or it will displace some
other string which must then be encoded; somewhere along the line there is an
expansion.  This
is why it is called a pigeonhole argument; you can't get n pigeons into n-1
pigeonholes, no matter how much scrambling of pigeons (as opposed to their
eggs) you do.
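The counting behind the pigeonhole argument can be checked directly (a sketch in Python, not part of the original post): for any length n there are 2^n distinct n-bit strings, but only 2^n - 1 strings of all shorter lengths combined, so no injective compressor can shorten every n-bit input.

```python
# Pigeonhole counting: 2**n inputs of length n, but only 2**n - 1
# candidate outputs shorter than n bits (lengths 0 through n-1).
# An injective (lossless) compressor therefore cannot shrink them all.

n = 8
inputs = 2 ** n                          # distinct n-bit strings
shorter = sum(2 ** k for k in range(n))  # strings of length 0..n-1

assert shorter == 2 ** n - 1             # always exactly one short
print(f"{inputs} inputs, only {shorter} shorter outputs available")
```

However large n gets, the supply of shorter strings is always one pigeonhole short of the number of pigeons.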

>        If there is a theorem that takes into account the need to communicate 
>the algorithm in advance, or to send it along with each message, it is 
>immediately plausible to me that there will be *on the average* no way to 
>compress a lot of random sequences.  

The same algorithm must apply to all sequences, which means essentially being
*known* in advance (but not transmitted, *as has already been said* (is that
too loud?)).  Otherwise, every sequence could have its own algorithm that
translates that sequence to a single bit (e.g., "if input is S output 1, else
output 0, an encoding of the input length, and then copy the input to the
output").
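The cheat described above is easy to make concrete (an illustrative sketch, not from the post): pick one favourite string S, encode it as a single bit, and prefix everything else with a 0.  The codec is lossless, but S is baked into the algorithm, and every other string expands by one bit.

```python
# "One algorithm per sequence" cheat: S compresses to one bit only
# because the decoder already contains S; all other inputs expand.

S = "1" * 100  # the one string this codec favours

def encode(bits: str) -> str:
    return "1" if bits == S else "0" + bits

def decode(code: str) -> str:
    return S if code == "1" else code[1:]

assert decode(encode(S)) == S and len(encode(S)) == 1
for other in ("0", "01", "0" * 100):
    assert decode(encode(other)) == other
    assert len(encode(other)) == len(other) + 1  # one-bit expansion
```

The "compression" is an accounting trick: the information hasn't gone away, it has moved into the codec, which is why the algorithm must be fixed in advance for the theorem to say anything.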

That there is no way "on the average" to compress an arbitrary sequence is
shown by the pigeonhole argument.

>        I am curious to know how the theorem is stated, and its proof.  Can 
>you refer me to it?

Someone published the pigeonhole proof here the other day.

>I'll try the FAQ you mentioned.  I gave up on trisecting 
>the angle (with a compass and straightedge, not with mechanical linkages which 
>make it easy) and perpetual motion machines (that generate power) 36 years 
>ago.  I gave up on universal compression the first time I thought about it.

A procedure that can replace repeated sequences on rare occasions without
producing compensating expansions or creating ambiguities is a universal
compression algorithm; it has a net compression effect.  No algorithm can have
such an effect.  (An algorithm that has a net compression effect but in fact
expands some sequences can be replaced with an algorithm that has the same net
compression effect but doesn't expand any sequences, by translating the
expanded sequences into the holes left by the compressed sequences.)

>I 
>did not give up on trying to understand things I don't understand yet, nor 
>decide to believe whatever I'm told by the loudest, most forceful voice!

I gave explanations and references, urged knowledge over ignorance, and
suggested that the discussion is out of place in this forum.  I didn't
transmit any high volume audio, but I would hope that logic is seen as
forceful.

There are some things that some people won't understand unless or until they
develop a proper way of looking at things, but no one has any obligation to
tutor them.  I am familiar with the subject, the logic, and the literature in
a way that you are not.  I'm not arguing with you, I am pointing something out;
you are free to ignore it.  But there are no net compression algorithms, you
can't write one, it's a theorem, it will remain a theorem whether you
understand it or not, whether you believe it or not.  *I* certainly don't need
for you to believe it.

>        This all came from the thought that you would not really have a random 
>sequence of numbers if you took out all the compressible numbers.  I'll stand 
>by that intuition, until someone gives me a clear proof it's not so.  : )

If you take out all the compressible numbers, you have only random numbers.
Read Gregory Chaitin's work, referred to previously.  But you probably mean
something by these terms other than their technical meanings, although I
suspect that you would have trouble giving them coherent alternate meanings.
If you mean to remove from a sequence all contiguously repeated subsequences,
the only possible sequences you could have left, if this refers to binary
sequences, are 0, 1, 01, 10, 010, 101, and the empty sequence.
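That claim can be verified by brute force (a sketch in Python, not part of the original post): enumerate binary strings and keep those containing no contiguously repeated block xx.  Every binary string of length 4 or more contains such a repetition, so the survivors are exactly the seven listed.

```python
from itertools import product

def has_square(s: str) -> bool:
    """True if s contains a contiguously repeated block xx."""
    n = len(s)
    for i in range(n):
        for k in range(1, (n - i) // 2 + 1):
            if s[i:i+k] == s[i+k:i+2*k]:
                return True
    return False

# Collect all square-free binary strings up to length 5.
square_free = {""}
for length in range(1, 6):
    for bits in product("01", repeat=length):
        s = "".join(bits)
        if not has_square(s):
            square_free.add(s)

print(sorted(square_free, key=lambda s: (len(s), s)))
# → ['', '0', '1', '01', '10', '010', '101']
```

Searching further lengths adds nothing: any longer binary string must repeat a block, so the set above is complete.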

As a general strategy, I suggest holding all intuitions as suspect in the
absence of a demonstration (proof may be hard to come by) of their validity.
You are better off not holding as true that which is true than holding as true
that which is false.


-- 
<J Q B>

