Genetic Algorithms Digest    Tuesday, 3 January 1989    Volume 3 : Issue 1

 - Send submissions to GA-List@AIC.NRL.NAVY.MIL
 - Send administrative requests to GA-List-Request@AIC.NRL.NAVY.MIL

Today's Topics:
	- Re: Classifier System Problems (2 Msgs)
	- Constrained Functional Optimization
	- GA/CS Benchmarks (3 Msgs)

--------------------------------

From: John_Holland@ub.cc.umich.edu
Date: Thu, 29 Dec 88 15:21:32 EST
Subject: Re: Classifier System Problems

     It's probably a useful time to make a comment or two about
classifier systems.  While I generally agree with Michael Hall's
cautionary notes [in V2N26 -JJG], the picture is not as black as he paints it:

     1) It is too early to make a judgment of the success or failure of
classifier systems.  Most of Michael Hall's comments about the
elapsed time without a "real" success would apply equally well to
an earlier stage in the development of genetic algorithms (except
that the elapsed time for genetic algorithms was longer and the
comments were more negative [there are still those, like Terry
Sejnowski, who make naive comparisons between geological
timespans and the speed of genetic algorithms]).

     2) The paper by Robertson and Riolo in the recent special
edition of MACHINE LEARNING on genetic algorithms gives a
good account of some of the realized possibilities for classifier
systems.  The Letseq experiments included in Riolo's
well-documented, standard C package for classifier systems make it
easy to check out the pluses and minuses of the bucket brigade.

     3) In applications with sparse payoff, neural nets are a poor
alternative to classifier systems.  In their "real" incarnations neural
nets require detailed error feedback on every step.  Classifier
systems can work with payoff (an occasional signal that simply
ranks the outcome of a SEQUENCE of actions), and they can
implement Samuel's method of prediction and revision.

     4) Classifier systems open the possibility of treating individual
rules as context-sensitive building blocks, adding "recombination"
at the level of rules to the recombination supplied by the genetic
algorithm.  This becomes particularly important when one is
dealing with massively parallel systems such as the Connection
Machine.
 
     5)  My own experiments with 'triggered coupling' and
'triggered bridges', those of Robertson and Riolo, and those in
Riolo's dissertation, show that useful chains can be generated and
that there can be a progressive doubling in preserved chain length.
These two operators are robust and simply implemented.
Triggered coupling, in particular, steps around Hall's [Murphy's]
Law of Classifier Systems.
 
     It is much too early to say much about the general contexts in
which the Pitt approach is better than the Michigan approach, or
vice versa, let alone writing off one or the other.  For those of you
interested in the Michigan approach, if you are not too pressed with
requirements for immediate application, take heart.  It is in its
fertile early stages.  The genetic algorithm treats a population made
up of a single species; a classifier system overlays the genetic
algorithm with an ecology of many species.  [Special messages that
play a role like that of pheromones in restricting mating to useful
recombinations offer an interesting, robust way of making a really
sophisticated system, but that's a story for the future.]  You will
learn much about processes that are common to a lot of systems
that build internal models -- the CNS, ecologies, economies,
immune systems and the like.
                                                           John Holland

--------------------------------

From: nvuxh!hall@bellcore.bellcore.com
Date: 23 Dec 1988  21:50 EST
Subject: Murphy's Law of Classifiers (was Re: Classifier System Problems)

My personal experience with genetic algorithms started out two
years ago, when I was brainwashed with the dreadful equation:
genetic algorithm = classifier system.  I built a classifier system
and applied it to a simple guessing game.  With a great amount of
effort, it managed to come up with a "perfect" set of classifiers.  I
examined the classifiers, and found that it had cleverly solved the
problem with just one classifier, which had come to dominate
the population.  I rewrote the interpretation routines so that the
problem could be solved by a series of two classifiers, but not by a
single classifier.  It failed to come up with a "perfect" solution,
despite the simplicity of the problem.   I believe I had set things
up properly and tried many reasonable settings of parameters, but of
course I could be the one to blame for the failure.

Foolishly, I then attempted to apply a classifier system to a
concept learning problem: the problem was to input corporate tax data
and output whether or not the corporation was "cheating."  (Feedback
was based on comparing the system's answer to the actual result of
the audit.) The results were abysmal.  Plotted over time, the
performance would start to improve as individual classifiers were
recognized as valuable and crossed, but then performance would
crash and formerly strong classifiers would die out; then
performance would improve again.  This cycle continued for some
time until the population sputtered out to a very poor performance
"just guessing" solution.  What was responsible?  PARASITES!

This brings me to the last point of my previous post:

>Occasionally through random mutations and crossover, classifiers are
>produced capable of outputting a message that triggers another
>classifier.  This is a necessary feature of classifier systems, but
>it has an unfortunate consequence:
>
>  MURPHY'S LAW OF CLASSIFIER SYSTEMS:  Harmful classifiers are produced 
>  and "linked" to high strength classifiers more often than beneficial
>  classifiers are produced and "linked" to high strength classifiers.

A worthless classifier of the form ####,####-->1010, for instance,
could attack a high strength classifier of the form 10#1,-001#-->#10#.
(It's not essential to understand that gibble-dee-gook to read the rest.)
A parasite such as this is quite voracious, destroying its host
rapidly and efficiently as the strength is drained almost every iteration. 
Useful classifiers generally do not trigger other classifiers every
iteration, so the problem is compounded.  Of course, what usually
happens is that the host spawns offspring resistant to the parasite.
The host eventually dies off, taking its parasite down with it. 
But before the parasite dies, it can easily produce offspring
capable of attacking the host's successful offspring.  Thus it came
to pass in my system that there were two main classes of
classifiers at war: the honest classifiers and the parasitic classifiers.
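The dynamic described above can be sketched in a few lines.  This is a
toy model with made-up numbers, not the actual system: under the bucket
brigade, a firing classifier pays its bid to whichever classifier
produced the message it matched, so a parasite whose output message
triggers a high-strength host collects the host's bid every iteration,
while the host's environmental payoff arrives only occasionally.

```python
# Toy bucket-brigade sketch (hypothetical parameters, simplified payoff).

BID_FRACTION = 0.1  # fraction of strength bid when a classifier fires

class Classifier:
    def __init__(self, name, strength):
        self.name = name
        self.strength = strength

host = Classifier("host", 100.0)        # high-strength, useful rule
parasite = Classifier("parasite", 10.0) # worthless rule whose output
                                        # message matches the host's condition

history = []
for step in range(50):
    # The host matches the parasite's message, so when the host fires
    # it pays its bid "up the chain" to the parasite -- every iteration.
    bid = BID_FRACTION * host.strength
    host.strength -= bid
    parasite.strength += bid
    # The host earns payoff only occasionally (sparse reward).
    if step % 10 == 0:
        host.strength += 5.0
    history.append((host.strength, parasite.strength))

print(history[-1])  # the parasite ends up far stronger than its host
```

With these (arbitrary) settings the drain outpaces the sparse payoff, so
the host's strength collapses -- the "voracious" behavior described above.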

Holland and others present such phenomena in a favorable light.
Holland has commented that classifier systems are so much like
natural evolution that parasites have been observed (Machine Learning:
An Artificial Intelligence Approach Volume II), and that
classifier systems are so much like cognition that hallucinations
can occur (from his paper in the second proceedings, I think). 
Westerdale wrote a paper "Altruism in the Bucket Brigade" (in the
second proceedings), but I would like to see a paper entitled
"Bloodsucking in the Bucket Brigade."

For my classifier system attempting to learn the concept of "tax
cheater," Murphy's Law of Classifier Systems caused havoc.  The end
result was that the honest classifiers evolved to become almost totally
parasite-resistant by having virtually no wildcards in their
conditions (i.e. the conditions said "almost never fire").  One
result was that all the parasites died off.  The other result was that
the system became totally useless, doing no better than random
guessing.  In a concept learning problem such as this, you rarely
see the same example twice, so generality is a necessity, but
generality is made impossible by Murphy's Law of Classifier Systems.

Long aside:
  [Incidentally, I applied a concept learning algorithm called PLS1
  (similar to ID3) to the tax audit database and it was able to
  perform considerably better than guessing.  This happens to be a
  particularly difficult problem for which there are no human experts
  and standard statistical techniques have failed.  If you want to learn
  concepts, use statistics-based concept learning algorithms, not
  classifier systems.  For additional evidence see "An Empirical
  Comparison of Genetic and Decision-Tree Classifiers" in the
  Proceedings of the Fifth International Conference on Machine
  Learning.  In this paper, Quinlan pits his ID3 algorithm against
  Wilson's Boole system (a classifier system sans bucket brigade -
  hmmm... one-step rules and no chance of parasites.) ID3 is shown
  to blow Boole away on the multiplexor problem.  Furthermore, the
  particular problem was originally chosen by Wilson and is rather
  atypical of concept learning problems; it is rather biased towards
  Boole's representation and rather biased away from ID3's representation.
  I see ways of genetic algorithms learning concepts, but classifier
  systems are not a good representation.]
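For readers who have not met the benchmark mentioned in the aside, the
6-multiplexer is easy to state: two address bits select which of four
data bits to output, and a learner is scored on how often it predicts
the output correctly.  A minimal sketch (the bit ordering is an
assumption here; conventions vary):

```python
# The standard Boolean 6-multiplexer: 2 address bits + 4 data bits.
from itertools import product

def multiplexer6(bits):
    """bits: a sequence of six 0/1 values; returns the selected data bit."""
    address = bits[0] * 2 + bits[1]   # first two bits form the address
    return bits[2 + address]          # address picks one of the 4 data bits

# A learner is scored over the 64 possible inputs.
cases = [(b, multiplexer6(b)) for b in product((0, 1), repeat=6)]
print(len(cases))  # 64 input/output pairs
```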
 
However, my main point is general, and not limited to concept
learning: you can't escape Murphy's Law of Classifier Systems. 
Remember, the classifier systems are not directly attempting to
optimize their performance!  The *individuals* in the population are
attempting to live long and prosper, thus propagating their genetic code. 
In the case of my concept learning problem, the individuals found
that the best means of surviving was to essentially cease to
function.  The end result is that the whole system ceased to
function, which is the worst possible solution for the system as a
whole, but it is the best solution for "honest" classifiers in light
of "parasitic" classifiers. 

Murphy's Law of Classifier Systems extends to the Pittsburgh
approach.  However, consider the effects of "bad classifiers" in a
Pittsburgh approach system.  The very worst that can happen is that
an offspring is generated with one or more bad classifiers; big
deal, so what if an individual dies off - the population lives on. 
Furthermore, the genetic operators act to prune away harmful
classifiers from otherwise good individuals.  Fitness in the
Pittsburgh approach is directly correlated to performance, so
performance will almost surely be optimized, at least to a local
optimum (unlike the Michigan approach, which is quite capable of generating
the worst possible solution.)

I have had some success with applying the Pittsburgh approach to the
problem of inducing custom-tailored disk scheduling algorithms;
sets of cooperating rules were created and some slightly
outperformed standard disk scheduling algorithms for the particular
simulated environment. Unfortunately, I have never applied
Pittsburgh approach and Michigan approach systems to the same
problem for comparison.  Smith's LS-1 poker learner is the commonly
referenced successful Pittsburgh approach system, but no Michigan
approach system has been applied to that problem (to my knowledge.) 
From my reading of his thesis, I believe that Smith's LS-1 did
induce sets of cooperating rules.

In sum, I have seen no convincing empirical evidence supporting the
value of the bucket-brigade algorithm.  Furthermore, I see Murphy's
Law of Classifier Systems as a plausible explanation of the failure
of the Michigan Approach; in fact it was my gut feeling before I had
implemented the Michigan Approach.  In my own personal experience, I
have found the Pittsburgh approach conceptually and empirically
superior to the Michigan approach, although I have never given them
a "fair" empirical comparison.

Just my opinion.  What's yours?

Michael R. Hall (hall%nvuxh.UUCP@bellcore.COM  OR  bellcore!nvuxh!hall)

--------------------------------

From: POWELL DAVID J <POWELL@ge-crd.arpa>
Date: 27 Dec 88 08:15 EST
Subject: Constrained Functional Optimization

As a relatively new member of the GA community, I have found this bulletin
board to be of tremendous assistance. I have received some of your expert
replies, software and comments, which have helped to guide my experimentation
with GA for function optimization. In fact, I have been so pleased with
the results that I am putting a paper together for submission to the GA
conference this summer.

However, I have run into a problem that I would appreciate some comments on.
I am doing constrained functional optimization where I am picking
the best design that meets constraints. For example, assume that my
function has 4 outputs, A, B, C, and D. My goal is to maximize A and
to have B, C, and D meet constraints. 
First, I will present the optimization issue and then the constraint issue:

Optimization:
I may or may not know what the maximum possible value of A is.
Sometimes I am performing weighted optimization on two or more variables 
such as trying to maximize A and minimize B. In this case (assuming equal
weights), my performance measure is
    (current A)/(initial A) - (current B)/(initial B).
For example, if my initial run of the design had A= 10 and B= 20 and my
second functional optimization had A = 15 and B = 25 then my performance
measure would be ((15/10) - (25 / 20)) = .25.
While this method is very simple, I do not know at the start of the run
what the range of the optimization values will be.
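The measure is trivial to compute; a one-line sketch reproducing the
worked example above (the function name is illustrative):

```python
# Equal-weight "maximize A, minimize B" measure, normalized by the
# initial design's values.
def performance(a, b, a0, b0):
    return a / a0 - b / b0

print(performance(15, 25, a0=10, b0=20))  # 0.25, as in the example
```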

Constraints:
The constraints can be closed on both ends, 85 <= B <= 95, 
or open on a single end, 85 <= B (note there
is no upper limit here). Furthermore the different variables can have
constraints of different magnitudes, 85 <= B <= 95; .001 <= C <= .007;
and 2160 <= D. In addition, some constraints can be weak and others
hard.  (A weak constraint is one that can be slightly violated if the
optimization gain is great enough to justify it.)

My problem is picking a penalty function method to properly and fairly
penalize the different magnitude constraints and possible open ended
constraints from a performance value whose range and magnitude is not
known at the start.
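For concreteness, here is one commonly suggested direction, sketched
rather than prescribed: normalize each violation by the magnitude of its
own bound, so constraints of different scales are penalized comparably,
with open-ended constraints using only the bound that exists and "weak"
constraints getting a smaller weight.  All function names and weights
below are illustrative assumptions, and it presumes nonzero bounds.

```python
# A sketch of a scale-normalized penalty scheme (illustrative only).

def violation(value, lo=None, hi=None):
    """Relative amount by which `value` falls outside [lo, hi].
    Either bound may be None for an open-ended constraint."""
    if lo is not None and value < lo:
        return (lo - value) / abs(lo)
    if hi is not None and value > hi:
        return (value - hi) / abs(hi)
    return 0.0

def penalized_score(performance, constraints):
    """constraints: list of (value, lo, hi, weight) tuples.
    Weak constraints get weight < 1 so a large gain can justify
    a slight violation."""
    penalty = sum(w * violation(v, lo, hi) for v, lo, hi, w in constraints)
    return performance - penalty

# Example with the constraints above: 85 <= B <= 95 (hard),
# .001 <= C <= .007 (hard), 2160 <= D (treated as weak, weight 0.5).
score = penalized_score(0.25, [(84.0, 85.0, 95.0, 1.0),
                               (0.004, 0.001, 0.007, 1.0),
                               (2000.0, 2160.0, None, 0.5)])
print(score)
```

Because each violation is relative to its own bound, a one-unit miss on
B (scale ~85) is penalized far more than a one-unit miss on D (scale
~2160), which addresses the different-magnitudes problem, though not the
unknown range of the performance value itself.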

I would appreciate any comments, solutions, experiences, references or
software that will help me in this problem.

Finally, I would like to offer special thanks to J. Grefenstette whose software
GENESIS allowed me to quickly determine if GAs were a feasible approach to
my problem and to D. Offutt and M. Hall for their electronic comments and
advice.

Regards
Dave Powell  Powell@ge-crd.arpa or powell@crd.ge.com

--------------------------------

From: smith@Think.COM
Date: Tue, 27 Dec 88 11:40:27 EST
Subject: Benchmark Proposal

Season's Greetings,

Ever wonder how the Genetic Algorithm or Classifier system that you just
built compares to what other researchers in the GA community have done?  Or
even how such a system would compare to a neural network learning system,
simulated annealing, or some other learning system?  Currently it is not
easy to reproduce results or compare system performance within the GA
community because of the wide diversity of problem domains.  Some default
standards have arisen, but even old friends like the Travelling Salesman
problem are not completely specified.

Our neighbors in the Neural Network community have recently embarked on a
project of defining a standard set of benchmark problem domains to be used
for performance comparisons of different systems.  Such a set of benchmarks
could also be helpful for the GA community.

This letter is a solicitation for ideas on viable problem domains that
could be used as a set of standards to promote unbiased comparisons of GA
systems in the GA community.  It would also serve to provide a set of
benchmark tests for comparisons with other optimization and learning
methodologies.  Two categories of benchmarks are proposed:

  * Learning Problems
  * Function Optimization

This project is meant to be a community effort and its success will be
directly proportional to the variety of researchers who contribute to it.
The goal of the project is to create a multi-author paper that can be
presented at the GA conference this summer.  Time is short, so in the next
four weeks benchmark problem domains will be solicited and discussed for
both the Learning and Optimization domains.  If you are interested in
contributing to this project please read and comment on the other two
messages concerning this project contained in this issue of the GA-LIST.
Stewart Wilson and I will be working together to edit and direct the
discussions on the learning problems.  Wayne Mesard will be working on the
function optimization benchmarks.  If you have specific questions
concerning this work please direct them to us otherwise please use the
GA-LIST.

- Stephen J Smith  smith@think.com   Thinking Machines Corporation
- Stewart Wilson   wilson@think.com  Rowland Institute 
- Wayne Mesard     mesard@bbn.com    Bolt Beranek and Newman

--------------------------------

From: smith@Think.COM
Date: Tue, 27 Dec 88 11:40:27 EST
Subject: Benchmark Learning Problems

BENCHMARK LEARNING PROBLEMS

Motivation:

The list of learning problems that have been tackled using classifier or
other GA systems is short, which means that it is not reasonable to use
only these problems for the benchmark suite.  Though we may start with
previous work, the goal of the project will be to define an independent set
of 5 to 10 problems that can easily be implemented and used to test the
efficacy of any learning system.  It may be possible to borrow some
benchmarks from the neural network benchmark project.

The following is the proposed organization of the discussions:

1. Build a list of all successful classifier and other GA learning systems
    and describe as formally as possible the problems they address.
2. Determine an abstract description of important benchmarks
    (e.g., "We should have a task that requires long term memory"...)
3. Propose and discuss specific benchmarks in terms of the
    descriptions that we decide on in 2.
4. Present the results of these discussions at George Mason.

Here is a short list of classifier/GA learning systems that have been built
to date:

CS-1 -			Holland & Reitman
Poker Player -          Stephen F. Smith
Hypothetical Organism - Lashon Booker
Gas Pipeline -          Dave Goldberg
Animat -                Stewart Wilson
Prisoner's Dilemma -    Robert Axelrod
Multiplexer -           Stewart Wilson 
Consumer Choice -       Dave Greene & Stephen F. Smith
Scheduling -            M. R. Hilliard
Letter Sequence -       Rick Riolo &  George Robertson
Hamming Weights -       Dave Davis
RUDI -                  John Grefenstette

A few good questions to ask about this list are:

. What systems are missing from this list?  Pages 219-220 of Goldberg's book
   are a source of possible additions.

. Are any of these problem domains worthy of being a benchmark?  

. What are the characteristics that make these problems good or bad
   benchmarks?

- Stewart and Stephen


--------------------------------

From: mesard@BBN.COM
Date: Fri, 23 Dec 88 15:34:36 -0500
Subject: Request for Participation: An Optimization Test Suite


A FUNCTION OPTIMIZATION TEST SUITE
==================================
Boston Area Researchers in Genetic Algorithms and Inductive Networks
(BARGAIN) is sponsoring an effort to establish a new benchmarking suite for
function optimization systems.  While I will be acting as coordinator
of this effort, a successful standard requires a broad perspective
that addresses the goals and concerns of the genetic algorithm
community as a whole.

Therefore, I am soliciting contributions from the readers of this
list.  If you have a function optimization problem which is difficult,
representative of a class of problems, or interesting in some other
way, BARGAIN would welcome your participation in this project.  Our
goal is to have the test suite prepared and documented in a paper to
be presented at next summer's GA conference at George Mason
University.  Of course, anyone who contributes to this finished
product will receive co-authorship.

Our primary motivation in this undertaking is the hope that a
carefully composed set of non-trivial and eclectic problems could
focus and encourage GA research in the coming years.

SUBMISSIONS 
----------- 
Submissions should describe a problem in sufficient detail for a
researcher to quickly implement an evaluation function for it in a
programming language such as LISP or C.  The coded function should not
be too complex since our goal is to have people actually use it.
Descriptions may be source, equations and/or prose.  All contributions
should be mailed to Mesard@BBN.COM.

Contributors should also include a brief description of why they think
the problem is interesting, what class(es) of problems it belongs to,
what criteria should be used in scoring performance, etc.  And, of
course, clever and amusing names for problems are always welcome.

This is a group effort, not a contest.  All contributors and
interested parties will have the opportunity to be involved in the
discussion, modification and ultimate selection of the test suite
problems.

CLASSES OF PROBLEMS
-------------------
Since GAs are applicable to diverse types of problems, it is my
feeling that we should not try to build a suite which will spit out a
single performance measure of function optimizers.  Rather, it should
be diverse enough that it will point out the strengths and weaknesses
of the optimizers to which it's applied.  This is consistent with the
success many researchers have had recently in developing specialized
operators and modified GA techniques to better suit the specific needs
of the domain in which they're optimizing.

Here is my wish list of areas that should be covered by the test
suite.  This list is by no means complete.  For the sake of brevity I
will just describe the problem types, but I'm willing to send a more
detailed description or example problems via email.

o  Highly Epistatic Problems - Problems where one or more genes
   (nonreciprocally) affect the meaning of another.

o  Super Individual Problems and/or The Trap Problem - In short,
   problems where the optimal solution is hard to find, because
   suboptimal solutions are virtually nonexistent, or located
   elsewhere in weight space (sic).

o  Functions With Nonlinear Periodicity - For example, saw-tooth (or
   staircase) functions with ever-taller and ever-narrower teeth (or
   steps).

o  Noisy Functions - That is, functions with a Gaussian component.
   This is, in my view, the most controversial class of problems.
   While it's clearly an important part of many real-world problems,
   its appropriateness in a test suite--where reproducibility is a
   major priority--is questionable.

o  Long Chromosome Problems - At BBN as well as elsewhere, researchers
   are finding that real-world problems for which GAs are well suited
   often have several hundred genes per chromosome.

Important additions to this list are as welcome as specific problem
submissions.
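To make two of the less familiar entries concrete, here are illustrative
sketches.  The function names and parameters are invented for the
example and are not part of the proposal.

```python
# Sketches of two proposed problem classes (illustrative only).
import random

def sawtooth_nonlinear(x):
    """Staircase with nonlinear periodicity: the k-th step has width 1/k
    and height k, so steps grow ever taller and ever narrower."""
    k, pos = 1, 0.0
    while pos + 1.0 / k <= x:
        pos += 1.0 / k
        k += 1
    return float(k)

def noisy(f, sigma=0.1):
    """Wrap an objective function with a Gaussian noise component."""
    return lambda x: f(x) + random.gauss(0.0, sigma)

print(sawtooth_nonlinear(0.5), sawtooth_nonlinear(1.2))
```

A fixed random seed would mitigate the reproducibility concern raised
above for the noisy class.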

-------

Comments, questions, requests to be added to the mailing list discussing
this project, etc., may be directed to me or to BARGAIN@BBN.COM.

Wayne_Mesard();          Bolt Beranek and Newman.     Mesard@BBN.COM

--------------------------------

End of Genetic Algorithms Digest
********************************

