Newsgroups: comp.ai.genetic
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!oitnews.harvard.edu!purdue!lerc.nasa.gov!magnus.acs.ohio-state.edu!math.ohio-state.edu!howland.reston.ans.net!news.sprintlink.net!simtel!nexus.coast.net!harbinger.cc.monash.edu.au!yarrina.connect.com.au!labtam!chris
From: chris@labtam.labtam.oz.au (Chris Taylor)
Subject: A cute landscape for playing with GA's
Message-ID: <1995Jun22.061433.25317@labtam.labtam.oz.au>
Keywords: GA landscape
Organization: Labtam Australia Pty. Ltd., Melbourne, Australia
Date: Thu, 22 Jun 1995 06:14:33 GMT
Lines: 125

Here's a fitness landscape that seems quite appropriate
for general-purpose testing of GAs and the like.

The genome is simply read as a string of 8-bit characters
(i.e. values 0-255), and the goal is to match one
particular string (e.g. "ABC").

I chose this problem just because it seemed nice and visual.
Watching the genomes throughout the generations one should hopefully
get a visual impression of the evolution converging.

I rated fitness by the sum of how well each character matches
its reference character (each character scores 0..127, peaking on an
exact match).
The corresponding fitness landscape contains a multitude of local optima
- with mountains upon mountains upon mountains.


For a three-character genome (i.e. 3 * 8 bits = 24 bits), plotted against
the genome read as a single 24-bit integer,
the landscape contains 256*256 local optima.
These are arranged on a three-level 'mountain range'.
There are 256 mini-triangles within each of 256 triangles within a mega-triangle.

(For an N-character genome there is an N-level hierarchy of triangles.)

This is a hill-climber's nightmare.
A hill-climber will quickly find its way up to one of the local optima,
but there are varying degrees of difficulty in jumping between hills.

It is prone to get stuck in optima like "?BC" which appear seductively
close to the ideal "ABC". This is because '?' is 0x3f and 'A' is 0x41:
every single-bit flip of 0x3f moves further away from 0x41, so to escape
requires a rather specific multi-bit mutation (or series thereof).

This landscape seems well suited to testing the basic ability of
a GA to cope with a sea of local optima. 


Here's a program to spit out a graph of the landscape.

/** Fitness landscape **/
/** The genome contains three genes **/
/** Each gene is an 8-bit character (0-0xff) **/
/** The 'ideal' genome has been royally decreed to be "ABC" **/

#include <stdio.h>

/* Fitness of individual gene (triangle peaking at 127 on reference) */
unsigned char fit(unsigned char x, unsigned char r)
{
    unsigned char f;

    r = 127 - r;  /* offset that maps x == reference onto 127 (wraps mod 256) */
    f = x + r;    /* wraps mod 256 */
    if (f == 0xff)
        f = 0;    /* 255 would mirror to -1, so clamp it to 0 */
    else
        if (f & 0x80) /* 128..254: mirror onto the falling side, 126..0 */
            f = 254 - f;
    return f;
}

/* Fitness of genome */
int fitness(unsigned char a, unsigned char b, unsigned char c)
{
    return fit(a,'A') + fit(b,'B') + fit(c,'C');
}


int main(void)
{
    int i, j, k;        /* int, not unsigned char, so the loops reach 255 and still stop */
    long g;             /* genome read as one 24-bit integer (the x coordinate) */
    double max = 381.0; /* 127*3 */

    for (i=0; i<256; i++)
        for (j=0; j<256; j++)
            for (k=0; k<256; k++) {
                if (k=='C')
                {
                    g = 65536L*i + 256L*j + k;
                    printf("%ld %f\n", g, fitness(i,j,k)/max);
                }
            }

    /* NOTE - the above loop just plots the lowest order peaks */
    /* i.e. The envelope of all the local maxima */

    /* To plot the whole landscape remove the line
                if (k=='C')
       but it gives a humungous number of points....

       Advisable to reduce the i loop limit to just get a segment
       of the fine structure.
    */
    return 0;
}

/** Description of landscape:
    There are 256*256 local maxima (each one 256 genomes wide). 
    The local maxima 'hills' are superimposed onto 256 'mountains'
    and the mountains are superimposed onto a 'mountain range' -
    which peaks at "ABC".
**/



Experimentation with this landscape rather clearly demonstrates the
(obvious in hindsight) advantage GAs get from crossover.

A GA with a 0% crossover probability is essentially equivalent to
a collection of independent hill-climbers.
With population size N, there are N little hill-climbers that find niches
in the landscape and get stuck there. They each must then wait for
(highly unlikely) multi-bit mutations to work their way toward the
global optimum.

But crossover has a two-fold effect, serving as a multi-bit mutation mechanism
and also providing an information sharing mechanism through which two 
partially correct genomes can combine to form a better offspring.

This can be viewed as an automated "let's restart the hill-climb somewhere
else using semi-informed (as opposed to random) starting points".


Of course, too much crossover will tend to disrupt good genes too often,
so that good genomes just don't get enough time to converge to the global
optimum.

