\slidehead{Genetic Algorithms \red \ } \bk
\centerline{[Read Chapter 9]}
\centerline{[Exercises 9.1, 9.2, 9.3, 9.4]}
\bi
\item Evolutionary computation
\item Prototypical GA
\item An example: GABIL
%\item Schema theorem
\item Genetic Programming
\item Individual learning and population evolution
\ei
\slidehead{\large Evolutionary Computation \red \ } \bk
\be
\item Computational procedures patterned after biological evolution
\item Search procedure that probabilistically applies search operators to set
of points in the search space
\ee
\slidehead{\large Biological Evolution \red \ } \bk
Lamarck and others:
\bi \item Species ``transmute'' over time \ei
\bigskip
\bigskip
Darwin and Wallace:
\bi \item Consistent, heritable variation among individuals in population
\item Natural selection of the fittest
\ei
\bigskip
\bigskip
Mendel and genetics:
\bi \item A mechanism for inheriting traits
\item genotype $\ra$ phenotype mapping
\ei
\newpage
\pgm{GA}($Fitness, Fitness\_threshold, p, r, m$)
\bi
\item
{\em Initialize:} $P \leftarrow$ $p$ random hypotheses
\item
{\em Evaluate:} for each $h$ in $P$, compute $Fitness(h)$
\item
While $[\max_{h}Fitness(h)] < Fitness\_threshold$
\bi
\item[1.]
{\em Select:} Probabilistically select $(1-r)p$ members of $P$ to add to
$P_{s}$.
\[ \Pr(h_{i}) = \frac{Fitness(h_{i})}{\sum_{j=1}^{p} Fitness(h_{j})} \]
\item[2.]
{\em Crossover:} Probabilistically select $\frac{r \cdot p}{2}$ pairs of
hypotheses from $P$. For each pair, $\langle h_{1}, h_{2} \rangle$, produce
two offspring by applying the Crossover operator. Add all offspring to
$P_{s}$.
\item[3.]
{\em Mutate:} Invert a randomly selected bit in $m\cdot p$ random members of
$P_{s}$
\item[4.]
{\em Update:} $P \leftarrow P_{s}$
\item[5.]
{\em Evaluate:} for each $h$ in $P$, compute $Fitness(h)$
\ei
\item
Return the hypothesis from $P$ that has the highest fitness.
\ei
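The loop above can be sketched in Python. The OneMax fitness (number of 1 bits), the parameter values, and the fixed bitstring length are illustrative assumptions, not part of the algorithm:

```python
import random

random.seed(0)  # for a reproducible run

def ga(fitness, fitness_threshold, p, r, m, length):
    """Prototypical GA sketch. p = population size, r = fraction of the
    population replaced by crossover each generation, m = mutation rate."""
    # Initialize: p random hypotheses
    P = [[random.randint(0, 1) for _ in range(length)] for _ in range(p)]
    while max(fitness(h) for h in P) < fitness_threshold:
        weights = [fitness(h) for h in P]
        # Select: (1-r)p members survive, chosen fitness-proportionately
        Ps = [h[:] for h in random.choices(P, weights=weights,
                                           k=round((1 - r) * p))]
        # Crossover: r*p/2 pairs, each producing two offspring
        for _ in range(round(r * p / 2)):
            h1, h2 = random.choices(P, weights=weights, k=2)
            cut = random.randrange(1, length)
            Ps.append(h1[:cut] + h2[cut:])
            Ps.append(h2[:cut] + h1[cut:])
        # Mutate: invert one randomly chosen bit in m*p random members
        for h in random.sample(Ps, round(m * p)):
            i = random.randrange(length)
            h[i] = 1 - h[i]
        P = Ps  # Update
    return max(P, key=fitness)

# Hypothetical fitness: number of 1 bits ("OneMax")
best = ga(fitness=sum, fitness_threshold=8, p=30, r=0.6, m=0.1, length=8)
```

With these sizes, selection keeps $0.4 \cdot 30 = 12$ members and crossover adds $2 \cdot 9 = 18$ offspring, so the population size stays constant at 30.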
\slidehead{\large Representing Hypotheses \red \ } \bk
Represent
\[ (Outlook = Overcast \tor Rain) \tand (Wind = Strong) \]
by
\begin{center}
\begin{tabular}{cc}
$Outlook$ & $Wind$ \\
011 & 10
\end{tabular}
\end{center}
\bigskip
\bigskip
\bigskip
\bigskip
Represent
\[\mbox{IF\ \ } Wind = Strong \mbox{\ \ \ THEN\ \ } PlayTennis=yes \]
by
\begin{center}
\begin{tabular}{ccc}
$Outlook$ & $Wind$ & $PlayTennis$ \\
111 & 10 & 10 \\
\end{tabular}
\end{center}
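A sketch of this encoding, assuming the attribute values are ordered (Sunny, Overcast, Rain), (Strong, Weak), (Yes, No):

```python
# One bit per attribute value; a 1 means the value is allowed.
# A "don't care" constraint is all 1s. Value orderings are assumed.
def encode(outlooks, winds, play):
    outlook = ''.join('1' if v in outlooks else '0'
                      for v in ('Sunny', 'Overcast', 'Rain'))
    wind = ''.join('1' if v in winds else '0' for v in ('Strong', 'Weak'))
    tennis = ''.join('1' if v in play else '0' for v in ('Yes', 'No'))
    return outlook + wind + tennis

# (Outlook = Overcast or Rain) and (Wind = Strong), postcondition unconstrained
precond = encode({'Overcast', 'Rain'}, {'Strong'}, {'Yes', 'No'})  # '0111011'
# IF Wind = Strong THEN PlayTennis = Yes
rule = encode({'Sunny', 'Overcast', 'Rain'}, {'Strong'}, {'Yes'})  # '1111010'
```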
\slidehead{\large Operators for Genetic Algorithms \red \ } \bk
\centerline{\hbox{\psfig{file=./bookps/ga-recomb.epsf,width=6.0in}}}
\slidehead{\large Selecting Most Fit Hypotheses \red \ } \bk
Fitness proportionate selection:
\[ \Pr(h_{i}) = \frac{Fitness(h_{i})}{\sum_{j=1}^{p} Fitness(h_{j})} \]
... can lead to {\em crowding}
\bigskip
\bigskip
Tournament selection:
\bi
\item Pick $h_1, h_2$ at random with uniform prob.
\item With probability $p$, select the more fit.
\ei
\bigskip
Rank selection:
\bi
\item Sort all hypotheses by fitness
\item Prob of selection is proportional to rank
\ei
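The last two schemes can be sketched as follows; the default tournament probability $p=0.75$ is an illustrative choice:

```python
import random

def tournament_select(pop, fitness, p=0.75):
    """Pick two hypotheses uniformly at random; with probability p
    return the more fit one, otherwise the less fit one."""
    h1, h2 = random.choice(pop), random.choice(pop)
    fitter, weaker = (h1, h2) if fitness(h1) >= fitness(h2) else (h2, h1)
    return fitter if random.random() < p else weaker

def rank_select(pop, fitness):
    """Sort by fitness; selection probability proportional to rank
    (the least fit hypothesis has rank 1)."""
    ranked = sorted(pop, key=fitness)
    ranks = range(1, len(pop) + 1)
    return random.choices(ranked, weights=ranks, k=1)[0]
```

Both schemes decouple selection pressure from the raw fitness values, which mitigates the crowding that fitness-proportionate selection can produce.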
\slidehead{\large GABIL [DeJong et al. 1993]\red }
\bk
Learn disjunctive set of propositional rules, competitive with \pgm{C4.5}
\bigskip
{\bf Fitness:}
\[ Fitness(h) = (correct(h))^2 \]
\bigskip
{\bf Representation:}
\[\mbox{IF } a_{1} = T \tand a_{2}=F \mbox{\ THEN\ } c = T
\mbox{; \ IF } a_{2} = T \mbox{\ THEN\ } c = F \]
represented by
\begin{center}
\begin{tabular}{ccccccc}
$a_{1}$ & $a_{2}$ & $c$ & \ & $a_{1}$ & $a_{2}$ & $c$ \\
10 & 01 & 1 & \ \ & 11 & 10 & 0 \\
\end{tabular}
\end{center}
\bigskip
\bigskip
\bigskip
{\bf Genetic operators:} ???
\bi
\item want variable length rule sets
\item want only well-formed bitstring hypotheses
\ei
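The rule encoding and the fitness measure can be sketched as follows. The 5-bits-per-rule layout (two bits per attribute, one class bit) follows the example above; classifying by the first matching rule is a simplifying assumption:

```python
RULE_LEN = 5  # bits per rule: 2 for a1 (values T,F), 2 for a2, 1 for c

def rule_matches(rule, a1, a2):
    # bit 0 allows a_i = T, bit 1 allows a_i = F
    return rule[0 if a1 else 1] == '1' and rule[2 if a2 else 3] == '1'

def classify(hypothesis, a1, a2):
    """Class of the first matching rule in the rule set, else None."""
    for i in range(0, len(hypothesis), RULE_LEN):
        rule = hypothesis[i:i + RULE_LEN]
        if rule_matches(rule, a1, a2):
            return rule[4] == '1'
    return None

def fitness(hypothesis, examples):
    """GABIL fitness: (number of examples classified correctly)^2."""
    correct = sum(classify(hypothesis, a1, a2) == c for a1, a2, c in examples)
    return correct ** 2

h = '1001111100'  # IF a1=T and a2=F THEN c=T ; IF a2=T THEN c=F
```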
\slidehead{Crossover with Variable-Length Bitstrings \red }
\bk
Start with
\begin{center}
\begin{tabular}{lccccccc}
\ & $a_{1}$ & $a_{2}$ & $c$ & \ & $a_{1}$ & $a_{2}$ & $c$ \\
$h_{1}:$ & 10 & 01 & 1 & \ \ & 11 & 10 & 0 \\
\ \\
$h_{2}:$ & 01 & 11 & 0 & \ \ & 10 & 01 & 0 \\
\end{tabular}
\end{center}
\be
\item
choose crossover points for $h_1$, e.g., after bits 1, 8
\item
now restrict points in $h_2$ to those that produce bitstrings with
well-defined semantics, e.g., $\langle 1,3 \rangle$, $\langle 1,8 \rangle$,
$\langle 6,8 \rangle$.
\ee
if we choose $\langle 1,3 \rangle$, result is
\begin{center}
\begin{tabular}{lccc}
\ & $a_{1}$ & $a_{2}$ & $c$ \\
$h_{3}:$ & 11 & 10 & 0 \\
\end{tabular}
\begin{tabular}{lcccccccccccc}
\ & $a_{1}$ & $a_{2}$ & $c$ & \ & $a_{1}$ & $a_{2}$ & $c$ & \ & $a_{1}$ & $a_{2}$ & $c$ \\
$h_{4}:$ & 00 & 01 & 1 & \ \ & 11 & 11 & 0 &\ \ & 10 & 01 & 0 \\
\end{tabular}
\end{center}
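The restriction on $h_2$ can be stated mechanically: its two crossover points must have the same offsets within their 5-bit rules as the points chosen in $h_1$. A sketch, reproducing the example above:

```python
RULE_LEN = 5  # bits per rule: 2 for a1, 2 for a2, 1 for c

def legal_points(h1_points, h2_len):
    """Pairs of crossover points in h2 (counted in bits from the left)
    with the same within-rule offsets as the points chosen in h1, so
    that both offspring parse into whole 5-bit rules."""
    d1, d2 = (pt % RULE_LEN for pt in h1_points)
    firsts = [pt for pt in range(1, h2_len) if pt % RULE_LEN == d1]
    seconds = [pt for pt in range(1, h2_len) if pt % RULE_LEN == d2]
    return [(a, b) for a in firsts for b in seconds if a <= b]

def crossover(h1, h2, pts1, pts2):
    """Swap the segments between the chosen points."""
    (a, b), (c, d) = pts1, pts2
    return h1[:a] + h2[c:d] + h1[b:], h2[:c] + h1[a:b] + h2[d:]

h1 = '1001111100'  # 10 01 1 | 11 10 0
h2 = '0111010010'  # 01 11 0 | 10 01 0
points = legal_points((1, 8), len(h2))   # [(1, 3), (1, 8), (6, 8)]
h3, h4 = crossover(h1, h2, (1, 8), (1, 3))
```

Choosing $\langle 1,3 \rangle$ yields the one-rule $h_3$ and three-rule $h_4$ shown above.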
\slidehead{GABIL Extensions \red }
\bk
Add new genetic operators, also applied probabilistically:
\be
\item
{\em AddAlternative}: generalize constraint on $a_i$ by changing a 0 to 1
\item
{\em DropCondition}: generalize constraint on $a_i$ by changing every 0 to 1
\ee
\bigskip
Also add new fields to the bitstring that determine whether to allow these operators:
\bigskip
\begin{center}
\begin{tabular}{cccccccccc}
$a_{1}$ & $a_{2}$ & $c$ & \ & $a_{1}$ & $a_{2}$ & $c$ &\ & $AA$ & $DC$ \\
01 & 11 & 0 & \ \ & 10 & 01 & 0 &\ & 1 & 0 \\
\end{tabular}
\end{center}
\bigskip
So now the learning strategy also evolves!
\slidehead{\large GABIL Results \red \ } \bk
Performance of \pgm{GABIL} comparable to symbolic rule/tree learning methods
\pgm{C4.5},
\pgm{ID5R}, \pgm{AQ14}
\bigskip
Average performance on a set of 12 synthetic problems:
\bi
\item \pgm{GABIL} without $AA$ and $DC$ operators: 92.1\% accuracy
\item \pgm{GABIL} with $AA$ and $DC$ operators: 95.2\% accuracy
\item symbolic learning methods ranged from 91.2\% to 96.6\% accuracy
\ei
\slidehead{ Schemas \red }
\bk
How to characterize evolution of population in GA?
\bigskip
Schema = string containing 0, 1, * (``don't care'')
\bi
\item Typical schema: 10**0*
\item Instances of above schema: 101101, 100000, ...
\ei
\bigskip
Characterize population by number of instances representing each possible
schema
\bi
\item $m(s,t) =$ number of instances of schema $s$ in pop at time $t$
\ei
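Schema membership and the count $m(s,t)$ are straightforward to compute:

```python
def matches(schema, bits):
    """A bitstring is an instance of a schema if they agree at every
    position where the schema is not '*'."""
    return all(s in ('*', b) for s, b in zip(schema, bits))

def m(schema, population):
    """m(s,t): number of instances of schema s in the population."""
    return sum(matches(schema, h) for h in population)
```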
\slidehead{\large Consider Just Selection \red \ } \bk
\bi
\item $\bar{f}(t) =$ average fitness of pop. at time $t$
\item $m(s,t) =$ instances of schema $s$ in pop at time $t$
\item $\hat{u}(s,t) =$ ave. fitness of instances of $s$ at time $t$
\ei
Probability of selecting $h$ in one selection step
\begin{eqnarray}
\Pr(h) & = & \frac{f(h)}{\sum_{i=1}^{n} f(h_i)} \nonumber \\
& = & \frac{f(h)}{n \bar{f}(t)} \nonumber
\end{eqnarray}
Probability of selecting an instance of $s$ in one step
\begin{eqnarray}
\Pr(h \in s) & = & \sum_{h\in s \cap p_{t}} \frac{f(h)}{n \bar{f}(t)} \nonumber \\
& = & \frac{\hat{u}(s,t)}{n \bar{f}(t)}m(s,t) \nonumber
\end{eqnarray}
Expected number of instances of $s$ after $n$ selections
\begin{eqnarray}
E[m(s,t+1)] & = & \frac{\hat{u}(s,t)}{\bar{f}(t)}m(s,t) \nonumber
\end{eqnarray}
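The last identity can be checked empirically: perform $n$ fitness-proportionate selections many times and count how many selected strings are instances of $s$. The population, fitness function (number of 1 bits), and schema below are illustrative assumptions:

```python
import random

random.seed(0)
pop = ['10110', '10001', '01111', '00000', '01010', '11011']
f = lambda h: h.count('1')          # assumed fitness: number of 1 bits
in_s = lambda h: h.startswith('1')  # schema s = 1****

n = len(pop)
f_bar = sum(f(h) for h in pop) / n                 # average fitness
m_s = sum(in_s(h) for h in pop)                    # m(s,t)
u_hat = sum(f(h) for h in pop if in_s(h)) / m_s    # ave. fitness of instances
expected = u_hat / f_bar * m_s                     # = 3.6 here

trials = 20000
total = 0
for _ in range(trials):
    # n fitness-proportionate selections = one generation of selection
    chosen = random.choices(pop, weights=[f(h) for h in pop], k=n)
    total += sum(in_s(h) for h in chosen)
empirical = total / trials  # close to `expected`
```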
\slidehead{\large Schema Theorem \red \ } \bk
\[E[m(s,t+1)] \geq \frac{\hat{u}(s,t)}{\bar{f}(t)}m(s,t) \left(1 -
p_{c}\frac{d(s)}{l-1}\right) (1 - p_{m})^{o(s)} \]
\bi
\item $m(s,t) =$ instances of schema $s$ in pop at time $t$
\item $\bar{f}(t) =$ average fitness of pop. at time $t$
\item $\hat{u}(s,t) =$ ave. fitness of instances of $s$ at time $t$
\item $p_c =$ probability of single point crossover operator
\item $p_m = $ probability of mutation operator
\item $l = $ length of the individual bit strings
\item $o(s) = $ number of defined (non-``*'') bits in $s$
\item $d(s) = $ distance between leftmost, rightmost defined bits in $s$
\ei
\slidehead{\large Genetic Programming \red \ } \bk
Population of programs represented by trees
\[ \sin(x) + \sqrt{x^{2} + y} \]
\centerline{\hbox{\psfig{file=./bookps/gp-rep.ps,width=3.75in}}}
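The tree for $\sin(x) + \sqrt{x^{2} + y}$ can be represented with nested tuples and evaluated recursively; this particular encoding (tuples plus a function table) is an illustrative choice, not part of GP itself:

```python
import math

# A program is a tree: a leaf is a variable name or a constant, an
# internal node is a tuple (function, child, ...).
tree = ('+', ('sin', 'x'), ('sqrt', ('+', ('*', 'x', 'x'), 'y')))

FUNCS = {'+': lambda a, b: a + b, '*': lambda a, b: a * b,
         'sin': math.sin, 'sqrt': math.sqrt}

def evaluate(node, env):
    if isinstance(node, tuple):
        op, *args = node
        return FUNCS[op](*(evaluate(a, env) for a in args))
    return env.get(node, node)  # variable lookup, else a constant
```

Crossover then amounts to swapping randomly chosen subtrees between two such trees.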
\slidehead{\large Crossover \red \ } \bk
\centerline{\hbox{\psfig{file=./bookps/gp-crossover.ps,width=4.5in}}}
\slidehead{\large Block Problem \red \ } \bk
\centerline{\hbox{\psfig{file=./bookps/gp-blocks.ps,width=4.5in}}}
Goal: spell UNIVERSAL
\bigskip
\bigskip
Terminals:
\bi
\item
CS (``current stack'') = name of the top block on the stack, or $F$ if
the stack is empty.
\item
TB (``top correct block'') = name of topmost correct block on stack
\item
NN (``next necessary'') = name of the next block needed
above TB in the stack
\ei
\bigskip
\bigskip
\newpage
Primitive functions:
\bi
\item
(MS $x$): (``move to stack''), if block $x$ is on the table, moves $x$ to the
top of the stack and returns the value $T$. Otherwise, does nothing and
returns the value $F$.
\item
(MT $x$): (``move to table''), if block $x$ is somewhere in the stack, moves
the block at the top of the stack to the table and returns the value $T$.
Otherwise, returns $F$.
\item
(EQ $x \ y$): (``equal''), returns $T$ if $x$ equals $y$, and returns $F$
otherwise.
\item
(NOT $x$): returns $T$ if $x=F$, else returns $F$
\item
(DU $x \ y$): (``do until'') executes the expression $x$ repeatedly
until expression $y$ returns the value $T$
\ei
\slidehead{\large Learned Program \red \ } \bk
Trained to fit 166 test problems
\bigskip
Using population of 300 programs, found this after 10 generations:
\bigskip
\centerline{(EQ (DU (MT CS)(NOT CS)) (DU (MS NN)(NOT NN)) )}
\bigskip
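The learned program can be checked against a small block-world interpreter. The semantics below (while-style DU, the particular mis-stacked initial state) are assumptions consistent with the primitive descriptions above:

```python
GOAL = list('UNIVERSAL')
F, T = False, True

class World:
    def __init__(self, stack, table):
        self.stack = list(stack)  # bottom ... top
        self.table = set(table)

    def CS(self):  # current stack: top block, or F if the stack is empty
        return self.stack[-1] if self.stack else F

    def TB(self):  # topmost correct block, or F if none
        k = self._correct()
        return self.stack[k - 1] if k else F

    def NN(self):  # next necessary block, or F when the stack is complete
        k = self._correct()
        return GOAL[k] if k < len(GOAL) else F

    def _correct(self):  # length of the correct prefix of the stack
        k = 0
        while k < len(self.stack) and self.stack[k] == GOAL[k]:
            k += 1
        return k

    def MS(self, x):  # move block x from the table to the top of the stack
        if x in self.table:
            self.table.remove(x)
            self.stack.append(x)
            return T
        return F

    def MT(self, x):  # if x is in the stack, move the top block to the table
        if x in self.stack:
            self.table.add(self.stack.pop())
            return T
        return F

def NOT(x):
    return x == F

def EQ(x, y):
    return x == y

def DU(do, until):  # assumed while-style: test `until` before each `do`
    while not until():
        do()
    return T

# (EQ (DU (MT CS) (NOT CS)) (DU (MS NN) (NOT NN)))
w = World(stack=['S', 'A', 'U'], table='NIVERL')
EQ(DU(lambda: w.MT(w.CS()), lambda: NOT(w.CS())),   # unstack everything
   DU(lambda: w.MS(w.NN()), lambda: NOT(w.NN())))   # rebuild in goal order
```

The first loop moves every block to the table; the second stacks the needed blocks in goal order, so the final stack spells UNIVERSAL.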
\newpage
\slidehead{\large Genetic Programming \red \ } \bk
More interesting example: design electronic filter circuits
\bi
\item Individuals are programs that transform a beginning circuit into a
final circuit by adding/subtracting components and connections
\item
Use population of 640,000, run on a 64-node parallel processor
\item
Discovers circuits competitive with best human designs
\ei
\slidehead{GP for Classifying Images \red \ } \bk
\centerline{[Teller and Veloso, 1997]}
\noindent
{\bf Fitness:}
based on coverage and accuracy
\bigskip
{\bf Representation:}
\bi
\item
Primitives include Add, Sub, Mult, Div, Not, Max, Min,
Read, Write, If-Then-Else, Either, Pixel, Least, Most, Ave, Variance,
Difference, Mini, Library
\item
Mini refers to a local subroutine that is separately co-evolved
\item
Library refers to a global library subroutine (evolved by selecting the most
useful minis)
\ei
\bigskip
{\bf Genetic operators:}
\bi
\item
Crossover, mutation
\item
Create ``mating pools'' and use rank proportionate reproduction
\ei
\slidehead{Biological Evolution \red \ } \bk
Lamarck (19th century)
\bi
\item Believed individual genetic makeup was altered by lifetime experience
\item But current evidence contradicts this view
\ei
\bigskip
What is the impact of individual learning on population evolution?
\slidehead{Baldwin Effect \red \ } \bk
Assume
\bi
\item Individual learning has no direct influence on individual DNA
\item But ability to learn reduces need to ``hard wire'' traits in DNA
\ei
Then
\bi
\item Ability of individuals to learn will support more diverse gene pool
\bi \item Because learning allows individuals with various ``hard
wired'' traits to be successful \ei
\item More diverse gene pool will support faster evolution of gene pool
\ei
\bigskip
\bigskip
$\ra$ individual learning (indirectly) increases rate of evolution
\slidehead{Baldwin Effect \red \ } \bk
Plausible example:
\be
\item New predator appears in environment
\item Individuals who can learn (to avoid it) will be selected
\item Increase in learning individuals will support more diverse gene pool
\item resulting in faster evolution
\item possibly resulting in new non-learned traits such as instinctive fear of predator
\ee
\slidehead{Computer Experiments on Baldwin Effect \red \ } \bk
\centerline{[Hinton and Nowlan, 1987]}
Evolve simple neural networks:
\bi
\item
Some network weights fixed during lifetime, others trainable
\item
Genetic makeup determines which are fixed, and their weight values
\ei
\bigskip
Results:
\bi
\item
With no individual learning, population failed to improve over time
\item
When individual learning allowed
\bi
\item
Early generations: population contained many individuals with many trainable
weights
\item
Later generations: higher fitness, while number of trainable weights
decreased
\ei
\ei
\slidehead{Summary: Evolutionary Programming \red \ } \bk
\bi
\item Conduct randomized, parallel, hill-climbing search through $H$
\item Approach learning as optimization problem (optimize fitness)
\item Nice feature: evaluation of Fitness can be very indirect
\bi \item consider learning rule set for multistep decision making
\item no issue of assigning credit/blame to indiv. steps
\ei
\ei