Newsgroups: comp.ai.philosophy
From: Lupton@luptonpj.demon.co.uk (Peter Lupton)
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!news.duke.edu!news-feed-1.peachnet.edu!gatech!howland.reston.ans.net!pipex!demon!luptonpj.demon.co.uk!Lupton
Subject: Re: Information And Entropy (models)
References: <39n4at$2kg@seagoon.newcastle.edu.au> <383svn$js9@galaxy.ucr.edu> <1994Oct20.214734.15940@forte.com> <9411041725.PN18337@LL.MIT.EDU>
Distribution: world
Organization: No Organisation
Reply-To: Lupton@luptonpj.demon.co.uk
X-Newsreader: Newswin Alpha 0.6
Lines:  119
Date: Thu, 10 Nov 1994 22:12:23 +0000
Message-ID: <683524865wnr@luptonpj.demon.co.uk>
Sender: usenet@demon.co.uk

There is a relationship between computation and physical entropy
which has been worked out by Landauer, Bennett and Zurek.

Landauer asked the question: how much entropy must computation generate?
He observed that computations involve operations (such as STORE)
which are many-to-one (2-to-1 in the simplest case). Such operations,
he observed, are bound to produce physical entropy (one bit's worth
in the 2-to-1 case). Other operations (such as EXCLUSIVE-ORing a
register with a known value) are 1-to-1 and so can be performed
reversibly, without any change in physical entropy. Still other
operations (such as MEASURING the environment) are computationally
1-to-many and so can be performed with a LOSS of physical entropy.
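
Here is a rough Python sketch of the distinction (my own illustration,
not Landauer's): a STORE that overwrites a bit is 2-to-1, an XOR
against a known constant is invertible, and Landauer's bound of
kT ln 2 per erased bit puts a figure on the cost. The 300 K
temperature is just an assumed value.

import math

k_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # assumed ambient temperature, K

def store(register, value):
    # 2-to-1: whatever the register held before, it now holds `value`.
    return value

def xor_with(register, constant):
    # 1-to-1: applying the same XOR again recovers the old register.
    return register ^ constant

# Landauer bound: each logically erased bit dissipates at least kT ln 2.
heat_per_bit = k_B * T * math.log(2)
print(f"minimum heat per erased bit: {heat_per_bit:.2e} J")

# Reversibility check for the 1-to-1 operation.
r = 0b1011
assert xor_with(xor_with(r, 0b0110), 0b0110) == r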

Landauer then observed that the fundamental operations of the computer
could all be made 1-to-1 (and so reversible) by adding extra outputs.
These outputs could be saved on a garbage tape. The result was a
computation which proceeded reversibly - only the clearing of the
garbage tape would result in an increase in entropy.
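
A small sketch of that trick (my own illustration, not Landauer's
construction): an AND gate is 2-to-1 on its own, but becomes 1-to-1
once its inputs are carried along as extra "garbage" outputs.

def and_irreversible(a, b):
    # 4 input pairs collapse onto 2 outputs: information is destroyed.
    return a & b

def and_reversible(a, b):
    # Keeping the inputs as garbage makes the map injective (1-to-1).
    return (a, b, a & b)

inputs = [(a, b) for a in (0, 1) for b in (0, 1)]
assert len({and_reversible(a, b) for a, b in inputs}) == 4    # 1-to-1
assert len({and_irreversible(a, b) for a, b in inputs}) == 2  # many-to-1
# The (a, b) garbage would go onto the garbage tape; only clearing that
# tape has to pay the entropy cost.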

Bennett then observed that the output of the computer could be copied
and the garbage tape cleared just by running the computation backwards.
In effect, Bennett "lifted" the question of reversibility of 
computation from individual operations to the level of the mathematical
function as a whole. In essence, a computation would generate 
physical entropy if it was many-to-one, could be done reversibly if 
1-to-1, and could reduce physical entropy if 1-to-many. (Of course,
real computations, as actually implemented, generate *enormous*
quantities of entropy!)
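
A toy sketch of Bennett's compute / copy / uncompute pattern (the
steps and their inverses here are invented for illustration): run
forward writing garbage, copy the answer, then run the same steps
backwards so the garbage tape ends up clean and the input restored.

def forward(x):
    # Each step is 1-to-1 once its input is recorded on the garbage tape.
    garbage = []
    y = x
    for step in (lambda v: v + 3, lambda v: v * 5):
        garbage.append(y)
        y = step(y)
    return y, garbage

def backward(y, garbage):
    # Apply the inverse steps in reverse order, un-writing the garbage.
    for inverse in (lambda v: v // 5, lambda v: v - 3):
        y = inverse(y)
        assert y == garbage.pop()
    return y

x = 7
y, tape = forward(x)
answer = y                      # COPY the output before uncomputing
assert backward(y, tape) == x   # tape is clean, input restored
print("input", x, "output", answer)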

This naturally led to the idea that entropy should be redefined:
instead of just considering physical entropy, add the entropy of
the computer's state. But how? 

Imagine a sequence of bits. This can, through the action of the
computer, take many forms. From the point of view of physical
entropy, any 1-to-1 computation could be done reversibly, so the
entropy we assign to the data should be unchanged by such a
transformation. A good idea, then, is to squeeze the data as far
as it will go - the data compression problem. This is reinforced
by Per Martin-Löf's observation that a maximally compressed string
(one whose length is the algorithmic complexity of the original
data) is itself a random sequence. What could be nicer? This takes
us straight away to Algorithmic Complexity as the appropriate
measure for the entropy of data in computer memory.
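
Algorithmic complexity itself is uncomputable, but a stock compressor
gives a crude upper bound, which is enough to see the point: a regular
sequence squeezes down to almost nothing, while a random one barely
compresses at all - the sense in which a maximally compressed string
looks random. A rough Python sketch:

import os
import zlib

regular = b"01" * 5000        # generated by a very short program
random_ = os.urandom(10000)   # incompressible with overwhelming probability

for name, data in (("regular", regular), ("random", random_)):
    compressed = len(zlib.compress(data, 9))
    print(f"{name}: {len(data)} bytes -> {compressed} bytes")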

If we now consider a typical information-theoretic situation,
we see that a source produces sequences of digits according to
some distribution (usually a probabilistic finite state machine
acts as the source). The Algorithmic Complexity of such sequences
can be defined and the mean taken over that distribution. The
result is, not unexpectedly, that the mean algorithmic complexity
per digit is the Shannon Information of that source. (Strictly,
this is an approximation which gets better and better as the
length of the sequence is allowed to increase.)
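
A quick numerical illustration of that claim, using zlib as a
stand-in for the true algorithmic complexity (so the figure is only
an upper bound) and an arbitrarily chosen source bias of p = 0.1:

import math, random, zlib

p = 0.1
shannon = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

n = 200_000
random.seed(0)
sequence = bytes(1 if random.random() < p else 0 for _ in range(n))
compressed_bits = 8 * len(zlib.compress(sequence, 9))

print(f"Shannon information of source: {shannon:.3f} bits/symbol")
print(f"compressed length:             {compressed_bits / n:.3f} bits/symbol")
# The compressed figure is an upper bound; better compressors and longer
# sequences bring it closer to the Shannon value.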

Algorithmic Complexity can thus be seen as a generalisation of
Shannon Information from distributions to individual sequences.

The following quote:

> There is a widespread view that the two concepts are complementary.
> Information is sometimes called negentropy. If you are careful about
> your information corresponding to physical states, then one bit
> will be at least kT. Leon Brillouin, _Science and Information Theory_
> is a pretty standard work on this. The second edition was published
> in 1962. Brillouin points out that Shannon information cannot be
> entropy, since it can decrease when passed through a passive filter.
> Entropy, on the other hand, cannot decrease spontaneously, except
> by the wildest chance. Shannon's definition was unfortunate, but
> no more unfortunate than the original definition of entropy by
> Clausius. 

would seem premature. The observations above show a straightforward
link between physical entropy, algorithmic complexity and information.
If a passive filter (or channel) reduces the information as measured
by Shannon's formula, we must suppose that the entropy of the data
is being converted by the channel into physical entropy. After all,
the channel is behaving in a many-to-1 manner which, according to
Landauer, results in generation of physical entropy. It would seem
as though Brillouin is just mistaken - the information-reducing 
channel will slowly warm up.
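
To put an (admittedly tiny) number on that warming, here is a sketch
with assumed figures - a 2-to-1 filter erasing one bit per symbol at
a gigabit per second, near room temperature:

import math

k_B = 1.380649e-23             # J/K
T = 300.0                      # assumed temperature of the filter
rate = 1e9                     # assumed: 1e9 symbols/second through the filter
bits_erased_per_symbol = 1.0   # 2-to-1 behaviour: one bit lost per symbol

power = rate * bits_erased_per_symbol * k_B * T * math.log(2)
print(f"minimum dissipation: {power:.1e} W")   # ~3e-12 W - real, but tiny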

Zurek applies these observations to the computerized version
of Maxwell's Demon. The demon computer measures the path 
of gas particles (or some such). If careful, the Demon Computer
can perform 1-to-many measurement-type steps by interacting with
the gas. Physical entropy is being lost from the gas and appears 
as algorithmic complexity in the computer's memory! There is no
overall decrease in entropy but there can be a movement of 
entropy from the gas to the demon computer's memory.
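
A bookkeeping sketch of that exchange, in the spirit of the Szilard
engine (one idealised bit per measurement; the numbers are per-bit
idealisations of my own, not Zurek's calculation):

import math

k_B = 1.380649e-23
ln2 = math.log(2)

gas_entropy = 0.0     # change relative to the start, J/K
memory_bits = 0       # bits of record (algorithmic complexity) in the demon

for _ in range(1000):             # 1000 ideal measure-and-sort cycles
    gas_entropy -= k_B * ln2      # the gas is ordered by one bit's worth
    memory_bits += 1              # the record in memory grows by one bit

memory_entropy = memory_bits * k_B * ln2
print(f"gas:    {gas_entropy:+.3e} J/K")
print(f"memory: {memory_entropy:+.3e} J/K")
print(f"total:  {gas_entropy + memory_entropy:+.3e} J/K")  # never below zero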

Such Demon Computers can act as refrigerators and, indeed, CERN
has such a device to cool particles in a storage ring.
The device measures the position of the particles in the ring
and gives them a kick in the right direction. The device has
another interesting feature - the time required from measurement
to kick is critical because the particles are moving close to the
speed of light. The control signal takes the path across the 
diameter, whereas the particles have to take the long route around
the circumference.
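
A back-of-envelope sketch of that timing margin, with an assumed ring
diameter (the real geometry at CERN will differ): the electronics get
only the difference between the half-circumference the particles
travel and the chord the signal takes.

import math

c = 2.998e8        # m/s
diameter = 200.0   # m, assumed for illustration

particle_path = math.pi * diameter / 2   # half the circumference
signal_path = diameter                   # straight across the ring

slack = (particle_path - signal_path) / c
print(f"time available to measure, compute and kick: {slack * 1e9:.0f} ns")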

The relevance of all this to c.a.p comes when we ask what a demon
computer is really doing. It is, after all, interacting with its
environment - making measurements; predicting what will happen;
changing what occurs. In this way we can see the Demon Computer
as a cognitive process and, indeed, we can imagine an AI system
in the place of Maxwell's Demon. What we see is that the information
the system gets about the environment is measured by the algorithmic
complexity of the computer's memory. It is a further claim that
the very process of induction (which, plainly, the Demon Computer
will have to engage in) is the process of finding concise 
representations of the data and using those representations to
make predictions. What starts out as a discussion about the
entropy of computation ends up as a discussion about the relationship
between induction, algorithmic complexity and entropy.
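
As a final sketch, here is what "induction as finding concise
representations" looks like in miniature: a two-part code (bits to
state the hypothesis plus bits to encode the data under it), with
the shortest total description winning. The candidate models and
the bits charged for naming them are assumptions of the example.

import math

data = [1 if i % 8 == 0 else 0 for i in range(64)]  # observed sequence
ones, zeros = sum(data), len(data) - sum(data)

def total_bits(p, model_bits):
    # Two-part code: describe the model, then the data under that model.
    data_bits = -(ones * math.log2(p) + zeros * math.log2(1 - p))
    return model_bits + data_bits

candidates = {
    "fair coin, p=0.5":     (0.5, 0.0),    # nothing extra to specify
    "biased coin, p=0.125": (0.125, 6.0),  # assume ~6 bits to name the bias
}

for name, (p, mbits) in candidates.items():
    print(f"{name}: {total_bits(p, mbits):.1f} bits")

best = min(candidates, key=lambda k: total_bits(*candidates[k]))
print("induced model:", best, "- use it to predict the next symbols")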

Cheers,
Pete Lupton
