From newshub.ccs.yorku.ca!torn!utcsri!rpi!usc!elroy.jpl.nasa.gov!swrinde!mips!darwin.sura.net!jvnc.net!nuscc!ntuix!eoahmad Tue Jul 28 09:41:19 EDT 1992
Article 6445 of comp.ai.philosophy:
Path: newshub.ccs.yorku.ca!torn!utcsri!rpi!usc!elroy.jpl.nasa.gov!swrinde!mips!darwin.sura.net!jvnc.net!nuscc!ntuix!eoahmad
From: eoahmad@ntuix.ntu.ac.sg (Othman Ahmad)
Newsgroups: comp.ai.philosophy
Subject: Information Theory of AI
Message-ID: <1992Jul14.121547.2038@ntuix.ntu.ac.sg>
Date: 14 Jul 92 12:15:47 GMT
Organization: Nanyang Technological University - Singapore
Lines: 179

Here is a comment I gave on the comments that the editor of SigArt, Lewis
Johnson, gave on my submission of a paper titled "Quantitative Measure of
Intelligence".  The comments make a good tutorial on Information Theory for
AI researchers/students. They also give me the opportunity to learn the
opinion of AI experts on this method.

I also submitted the same paper to other places, but those treat Information
Theory in a more classical textbook style, without relevance to current AI
theory. I am basically a Communication Engineer. I stumbled on this method of
measuring intelligence while analysing the information flow of a microprocessor.
Today I got a letter of acceptance from Prof. Osamu Hirota, Chairman of
ISITA (International Symposium on Information Theory and its Applications).


Thank you very much for taking the time to give comments. You are the first
one who has actually done so. No one on comp.ai.philosophy has actually
criticised it as you have.

If you can spare the time, I wish you would consider my counter-comments.
I am not appealing for reconsideration; I just would like to clarify certain
points.
	Please understand that I am a communication engineer, not an AI expert.
I have no interest in pursuing this matter further. I would rather build
machines than investigate and argue about definitions.
	I just feel helpless if I cannot measure anything concrete, that is,
assign units of measurement such as volts and bits.
	The theory as it stands is already useful to me in computer architecture,
just as information theory is already useful in communication, although a lot
of people still argue about its validity.

 
>Comments on paper:
 
>If what you are proposing were indeed a valid quantitative
>measure of intelligence, it would make predictions that could be
>compared against accepted qualitative measures of intelligence, as
>well as existing quantitative measures such as intelligence test.  You
>have done neither.  The reader has no basis for evaluating your
>approach.

I have no interest in pursuing it further. My main interest is to introduce
information theory for consideration by AI experts.
	Otherwise I would have prepared it for journal publication. Instead
I prepared it as a letter/correspondence only. I was hoping that AI
investigators would take the trouble to confirm, deny, or modify my theory,
because they have all the source code and data for their intelligent programs,
such as expert systems.

>For a start, I suggest you compare your stance against that
>taken by Allen Newell in Unified Theories of Cognition, published
>by Harvard University Press.  Newell is a recognized authority
>on cognitive science, and in his book he proposes a definition
>of intelligence, i.e., the ability of an agent to act in an
>uncertain environment in order to achieve its goals.  This

Excuse me. Isn't uncertainty the same as unpredictability?
According to Webster's New Dictionary and Thesaurus for school, home and office:
certain:sure,convinced;"sure to happen";regular, inevitable;indisputable; ......
predict:to foretell;to state(what one "believes will happen"); ..........

Can't we quantify the "ability of the agent" by measuring the "uncertainty of
the environment" with respect to one particular goal or set of goals?
	We can measure the uncertainty using information theory.  The beauty
of using information theory as a definition is that it is exact. However,
measuring it incurs measurement error, so we must resort to statistical
techniques to minimize that error.
	Let me give you an example based on Allen Newell's definition of
intelligence. Has he defined a way to measure this intelligence quantitatively?
	An agent wants to achieve a goal in an uncertain environment. In order
to achieve it he must do something; he must have alternatives. Otherwise it
would be impossible for him to achieve the goal. If he cannot even generate
the alternatives then he must be stupid, and there is no point in measuring
his intelligence. The more alternatives he has, the more decisions he has to make.
	The number of alternatives required to reach a goal in an uncertain
environment indicates the uncertainty of the environment: the more
alternatives, the more uncertain the environment. The goal is just one of the
alternatives. The uncertainty is, by definition, log base 2 of the number of
alternatives. Let us say this is INTL bits of uncertainty.
	This definition is just INFORMATION THEORY stated in a different way,
to make it clearer to students of AI. Why log base 2?
	If the agent can generate all the alternatives in just one step, then we
may say that he has generated INTL bits of intelligence, because he has managed
to reach his goal in an environment with INTL bits of uncertainty.
	The above paragraph is just standard measurement theory. For example,
the horsepower was defined by the rate at which a particular horse could drag
a particular weight. If an engine can drag the same weight at the same rate,
then that engine has 1 horsepower.
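The log-base-2 definition above can be sketched in a few lines of Python (the
function name is mine, purely illustrative):

```python
import math

def uncertainty_bits(num_alternatives):
    """Uncertainty of an environment, defined as log base 2
    of the number of alternatives the agent must choose among."""
    return math.log2(num_alternatives)

# An environment with 8 alternatives carries 3 bits of uncertainty;
# an agent that resolves it in one step has, under this definition,
# generated 3 bits of intelligence.
print(uncertainty_bits(8))   # 3.0
```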
	He may generate just 2 alternatives per step. He would therefore need
many steps to reach the goal. If there is no loss in generating alternatives,
he would need INTL steps.
	For example, let INTL = 3, so there are 2^3 = 8 alternatives. At each
step he can eliminate half of the remaining alternatives, because the solutions
must be possible. The alternatives are stored in his knowledge database.
If he cannot eliminate half of the alternatives, then there is loss; the loss
reflects how efficiently the decision tree is built. We are assuming that
there is no loss. So he must take 3 steps to reach his goal. On the other hand,
the decision tree may not be balanced, leading to other deviations.
	Log of 2 to base 2 is 1, so he can generate 1 bit of intelligence per
step. His capability for intelligence is 1 bit per step.
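The halving process can be simulated directly, assuming a lossless agent that
discards exactly half of the remaining alternatives at each step (my own toy
sketch, not from the paper):

```python
import math

def steps_to_goal(num_alternatives):
    """Count the steps a lossless agent needs, eliminating half
    of the remaining alternatives per step, until only the goal
    remains.  Each step contributes 1 bit of intelligence."""
    steps = 0
    remaining = num_alternatives
    while remaining > 1:
        remaining //= 2   # lossless halving of the alternatives
        steps += 1
    return steps

# 8 alternatives -> 3 steps, matching log2(8) = 3 bits of uncertainty.
print(steps_to_goal(8))   # 3
print(math.log2(8))       # 3.0
```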

At each step he adds to his intelligence. When he reaches his goal, he has used
up 3 bits of intelligence, which is the same as the uncertainty of the
environment.

The amount of intelligence required is 3 bits: the uncertainty of the
environment. The amount of intelligence used by the agent is also 3 bits,
because there is no loss.

Because we are using a logarithmic scale, we can simply add them all up. If we
used the number of alternatives as the unit, we would have to multiply or
divide instead.
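The convenience of the logarithmic scale can be checked directly: alternative
counts multiply across consecutive decisions, while their bit measures simply
add (a sketch with illustrative numbers of my own choosing):

```python
import math

# Two consecutive decisions: one among 4 alternatives, one among 8.
a, b = 4, 8

# Counting alternatives, the combined choice multiplies:
combined = a * b                     # 32 alternatives

# On the bit scale, the uncertainties just add:
bits = math.log2(a) + math.log2(b)   # 2 + 3 = 5 bits

assert math.log2(combined) == bits
print(combined, bits)   # 32 5.0
```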

	I am sorry that Allen Newell's book is not available in our library at
the moment; I will try to get it through inter-library loan. If you could give
me his email address it would indeed be very helpful.
	My theory does not contradict that definition. It would just assign
units, in bits, to the intelligence defined by Newell.

>view contradicts your view of intelligence as unpredictability.
Maybe you have misunderstood my words. What I mean is the ability to generate
unpredictable sequences of instructions.

>Although the individual actions of an intelligent agent are
>unpredictable, the overall outcome of an agent's actions can
>be predicted, given knowledge of the agent's goals and the
>agent's environment.  That is, intelligent agents with the
>same goals can be expected to behave in such a way that same
>goals are achieved.  Their behavior will be far from random.
	My discussion above assumes that we have full access to the
knowledge database and to the agent's ability to test alternatives.
	What happens if we just observe from outside? If the environment
for the goal is very uncertain, then we would observe the agent generating
various alternatives of which the observer cannot be certain.
If the observer has access only to one time slice, equivalent to the
time in which the agent generates one alternative, then we would have to
add up the intelligence units measured at each step to measure the total
intelligence units consumed by the agent in reaching that goal.
	What happens if we are capable of observing all 3 steps and getting
enough samples to decode the goal for each input? We can then simulate the
agent with just a lookup table. For each input there must be a goal. The size
of this lookup table (in bits) depends on the uncertainty of the environment.

So there are 2 methods of achieving the same goal:

1) Generating alternatives for each input, step by step, which takes longer; or
2) Using a lookup table, which is faster but needs more storage bits.

We can generalize: method 1 uses intelligence, method 2 uses knowledge.
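The trade-off between the two methods can be illustrated with a toy cost model
(the model is my own simplification: one step per halving for method 1, one
table entry per possible input for method 2):

```python
import math

def stepwise_cost(num_alternatives):
    """Method 1: generate alternatives step by step.
    Slower: log2(N) steps, but no stored table."""
    return {"steps": int(math.log2(num_alternatives)), "storage_bits": 0}

def lookup_cost(num_inputs, num_alternatives):
    """Method 2: a precomputed lookup table mapping each input to
    its goal.  Faster: one step, but each of the num_inputs entries
    needs log2(N) bits to name the goal."""
    entry_bits = int(math.log2(num_alternatives))
    return {"steps": 1, "storage_bits": num_inputs * entry_bits}

print(stepwise_cost(8))     # {'steps': 3, 'storage_bits': 0}
print(lookup_cost(16, 8))   # {'steps': 1, 'storage_bits': 48}
```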

>Instead of measuring the relative intelligence of software
>tools such as the GNU C Compiler, whose intelligence is
>in doubt, measure the intelligence of existing models of
>intelligent systems, such as particular expert systems.  Then

Expert systems are just interpreters/compilers: the rules are the grammar,
and the objects are the tokens.
	I do not have data for an expert system and I do not intend to get any.
I am hoping someone else will do more research on this.

>there might be a basis for deciding whether your
>approach is technically sound.

Honestly, I also have reservations about my theory, but I cannot pin them
down. I am hoping someone can find the fault in its theoretical foundation.

My colleagues have criticised my paper, but their comments do not affect the
usefulness of this theory for hardware designers.

The above tutorial on information theory may be useful for AI people. I would
like to post it to comp.ai to get more comments/corrections. If you do not
reply, I shall assume that you do not mind. Your comments will be included so
that readers will know how this article relates to AI.


--
Othman bin Ahmad, School of EEE,
Nanyang Technological University, Singapore 2263.
Internet Email: eoahmad@ntuix.ntu.ac.sg
Bitnet Email: eoahmad@ntuvax.bitnet


