The natural process of complexification has led to important biological innovations. Its most natural application in EC is in competitive coevolution, as will be reviewed below.
Mutation in nature not only results in optimizing existing structures: New genes are occasionally added to the genome, allowing evolution to perform a complexifying function over and above optimization. In addition, complexification is protected in nature in that interspecies mating is prohibited. Such speciation creates important dynamics differing from standard GAs. In this section, we discuss these characteristics of natural evolution as a basis for our approach to utilizing them computationally in genetic algorithms.
Gene duplication is a special kind of mutation in which one or more parental genes are copied into an offspring's genome more than once. The offspring then has redundant genes expressing the same proteins. Gene duplication has been responsible for key innovations in overall body morphology over the course of natural evolution .
A major gene duplication event occurred around the time that vertebrates separated from invertebrates. The evidence for this duplication centers around HOX genes, which determine the fate of cells along the anterior-posterior axis of embryos. HOX genes are crucial in shaping the overall pattern of development in embryos. In fact, differences in HOX gene regulation explain a great deal of the diversity among arthropods and tetrapods . Invertebrates have a single HOX cluster while vertebrates have four, suggesting that cluster duplication significantly contributed to elaborations in vertebrate body-plans . The additional HOX genes took on new roles in regulating how vertebrate anterior-posterior axis develops, considerably increasing body-plan complexity. Although argues that the additional clusters can be explained by many single gene duplications accumulating over generations, as opposed to massive whole-genome duplications, researchers agree that gene duplication in some form contributed significantly to body-plan elaboration.
A detailed account of how duplicate genes can take on novel roles was given by : Base pair mutations in the generations following duplication partition the initially redundant regulatory roles of genes into separate classes. Thus, the embryo develops in the same way, but the genes that determine the overall body-plan are confined to more specific roles, since there are more of them. The partitioning phase completes when redundant clusters of genes are separated enough so that they no longer produce identical proteins at the same time. After partitioning, mutations within the duplicated cluster of genes affect different steps in development than mutations within the original cluster. In other words, duplication creates more points at which mutations can occur. In this manner, developmental processes complexify.
Gene duplication is a possible explanation how natural evolution indeed expanded the size of genomes throughout evolution, and provides inspiration for adding new genes to artificial genomes as well. In fact, gene duplication motivated to allow entire functions in genetic programs to be duplicated through a single mutation, and later differentiated through further mutations. When evolving neural networks, this process means adding new neurons and connections to the networks.
In order to implement this idea in artificial evolutionary systems, we are faced with two major challenges. First, such systems evolve differently sized and shaped network topologies, which can be difficult to cross over without losing information. For example, depending on when new structure was added, the same gene may exist at different positions, or conversely, different genes may exist at the same position. Thus, artificial crossover may disrupt evolved topologies through misalignment. Second, with variable-length genomes, it may be difficult to find innovative solutions. Optimizing many genes takes longer than optimizing only a few, meaning that more complex networks may be eliminated from the population before they have a sufficient opportunity to be optimized.
However, biological evolution also operates on variable-length genomes, and these problems did not stop complexification in nature. How are these problems avoided in biological evolution? First, nature has a mechanism for aligning genes with their counterparts during crossover, so that data is not lost nor obscured. This alignment process has been most clearly observed in E. coli. A special protein called RecA takes a single strand of DNA and aligns it with another strand at genes that express the same traits, which are called homologous genes. This process is called synapsis.
Second, innovations in nature are protected through speciation. Organisms with significantly divergent genomes never mate because they are in different species. If any organism could mate with any other, organisms with initially larger, less-fit genomes would be forced to compete for mates with their simpler, more fit counterparts. As a result, the larger, more innovative genomes would fail to produce offspring and disappear from the population. In contrast, in a speciated population, organisms with larger genomes compete for mates among their own species, instead of with the population at large. That way, organisms that may initially have lower fitness than the general population still have a chance to reproduce, giving novel concepts a chance to realize their potential without being prematurely eliminated. Because speciation benefits the evolution of diverse populations, a variety of speciation methods have been employed in EC .
It turns out complexification is also possible in evolutionary computation if abstractions of synapsis and speciation are made part of the genetic algorithm. The NEAT method (section 3) is an implementation of this idea: The genome is complexified by adding new genes which in turn encode new structure in the phenotype, as in biological evolution.
Complexification is especially powerful in open-ended domains where the goal is to continually generate more sophisticated strategies. Competitive coevolution is a particularly important such domain, as will be reviewed in the next section.
In competitive coevolution, individual fitness is evaluated through competition with other individuals in the population, rather than through an absolute fitness measure. In other words, fitness signifies only the relative strengths of solutions; an increased fitness in one solution leads to a decreased fitness for another. Ideally, competing solutions will continually outdo one another, leading to an "arms race" of increasingly better solutions . Competitive coevolution has traditionally been used in two kinds of problems. First, it can be used to evolve interactive behaviors that are difficult to evolve in terms of an absolute fitness function. For example, evolved simulated 3D creatures that attempted to capture a ball before an opponent did, resulting in a variety of effective interactive strategies. Second, coevolution can be used to gain insight into the dynamics of game-theoretic problems. For example, coevolved iterated Prisoner's Dilemma strategies in order to demonstrate how they correspond to stages in natural evolution.
Whatever the goal of a competitive coevolution experiment, interesting strategies will only evolve if the arms race continues for a significant number of generations. In practice, it is difficult to establish such an arms race. Evolution tends to find the simplest solutions that can win, meaning that strategies can switch back and forth between different idiosyncratic yet uninteresting variations . Several methods have been developed to encourage the arms race . For example, a "hall of fame" or a collection of past good strategies can be used to ensure that current strategies remain competitive against earlier strategies. Recently, Ficici and Pollack (2001) and Noble and Watson (2001) introduced a promising method called Pareto coevolution, which finds the best learners and the best teachers in two populations by casting coevolution as a multiobjective optimization problem. This information enables choosing the best individuals to reproduce, as well as maintaining an informative and diverse set of opponents.
Although such techniques allow sustaining the arms race longer, they do not directly encourage continual coevolution, i.e. creating new solutions that maintain existing capabilities. For example, no matter how well selection is performed, or how well competitors are chosen, if the search space is fixed, a limit will eventually be reached. Also, it may occasionally be easier to escape a local optimum by adding a new dimension to the search space than by searching for a new path through the original space.
For these reasons, complexification is a natural technique for establishing a coevolutionary arms race. Complexification elaborates strategies by adding new dimensions to the search space. Thus, progress can be made indefinitely long: Even if a global optimum is reached in the search space of solutions, new dimensions can be added, opening up a higher-dimensional space where even better optima may exist.
To test this idea experimentally, we chose a robot duel domain that combines predator/prey interaction and food foraging in a novel head-to-head competition (Section 4). We use this domain to demonstrate how NEAT uses complexification to continually elaborate solutions. The next section reviews the NEAT neuroevolution method, followed by a description of the robot duel domain and a discussion of the results.