Evolutionary algorithms (EAs) are general purpose searching methods. The selection process and the crossover and mutation operators establish a balance between the exploration and exploitation of the search space which is very adequate for a wide variety of problems whose solution presents difficulties that are insolvable using classical methods. Most of these problems are defined in continuous domains, so the evolutionary algorithms applied use real values, namely, evolution strategies (EPs), real-coded genetic algorithms (RCGAs), and evolutionary programming (EP). For these paradigms the precision of the solution does not depend on the coding system, as in binary coded genetic algorithms, but on the precision of the computer system where the algorithms are run.

The selection process drives the searching towards the regions of the best individuals. The mutation operator randomly modifies, with a given probability, one or more genes of a chromosome, thus increasing the structural diversity of the population. As we can see, it is clearly an exploration operator, that helps to recover the genetic diversity lost during the selection phase and to explore new solutions avoiding premature convergence. In this way, the probability of reaching a given point in the search space is never zero. This operator, in fact, implements a random search whose well-studied features are useful in the field of evolutionary computation.

The crossover operator combines the genes of two or more parents to generate better offspring. It is based on the idea that the exchange of information between good chromosomes will generate even better offspring. The effect of the crossover operator can be studied from two different points of view: at chromosome level and at gene level. The effect of the crossover operator at chromosome level can be considered in a geometric way. Given two parents $ \mathbf{\beta}^1=\{\beta^{1}_1, \beta^{1}_2\}$ and $ \mathbf{\beta}^2=\{\beta^{2}_1, \beta^{2}_2\}$ with two genes, we denote by $ H_{\beta^1\beta^2}$ the hypercube defined by their genes (Figure 1a). At gene level the representation would be linear, defining in this case a segment or interval $ S_{\beta^{1}_i,\beta^{2}_i}$ for each pair of genes (Figure 1b). Most crossover operators generate individuals in the exploitation zones, $ S_{\beta^{1}_i,\beta^{2}_i}$ or $ H_{\beta^1\beta^2}$. In this way, the crossover operator implements a depth search or exploitation, leaving the breadth search or exploration for the mutation operator.

This policy, intuitively very natural, makes the population converge to values within the hypercubes defined by their parents, producing a rapid decrease in the population diversity which could end up in a premature convergence to a non-optimal solution. Recent studies on BLX-$ \alpha$ crossover [ES93], the crossover based on fuzzy connectives [HHVLV94], and fuzzy recombination [VMC95], have confirmed the good performance of those crossover operators that also generate individuals in the exploration zone. These operators avoid the loss of diversity and the premature convergence to inner points of the search space, but also the generation of new individuals in the exploration zone could slow the search process. For this reason, the crossover operator should establish an adequate balance between exploration (or interpolation) and exploitation (or extrapolation), and generate offspring in the exploration and exploitation zones in the correct proportion.

Figure 1: (a) Hypercube defined by the first two genes of the parents; (b) Representation of the segment defined by the $ i$th genes of two chromosomes.
Image RepGenCrom

Establishing a balance between exploration and exploitation is important, but it is also important that such a balance is self-adaptive [Kit01,BD01,DB01], that is, it must guarantee that the dispersion of the offspring depends on the dispersion of the parents. So, two close parents must generate close offspring, and two distant parents must generate distant offspring. The control of dispersion in the crossover based on fuzzy connectives is based on the generation of offspring using the fuzzy connectives t-norms, t-conorms, average functions, and a generalized operator of compensation [Miz89]. In fuzzy recombination the offspring is generated using two triangular distributions whose averages derive from each of the genes of the two parents. In BLX-$ \alpha$ we have the same probability of generating an offspring between the parents, and in an area close to the parents whose amplitude is modulated by the $ \alpha$ parameter.

Ono and Kobayashi [OK97] have proposed a Unimodal Normally Distributed Crossover (UNDX), where three parents are used to generate two or more children. The children are obtained using an ellipsoidal distribution where one axis is the segment that joins the two parents and the extent of the orthogonal direction is decided by the perpendicular distance of the third parent from the axis. The authors claim that this operator should preserve the statistics of the population. This crossover is also self-adaptive, but it differs from BLX-$ \alpha$ by the fact that it is more probable to generate offspring near the average of the first two parents.

Another self-adaptive crossover is the Simulated Binary Crossover (SBX) [DA95]. Based on the search features of the single-point crossover used in binary-coded genetic algorithms, this operator respects the interval schemata processing, in the sense that common interval schemata of the parents are preserved in the offspring. The SBX crossover puts the stress on generating offspring near the parents. So, the crossover guarantees that the extent of the children is proportional to the extent of the parents, and also favors that near parent individuals are monotonically more likely to be chosen as children than individuals distant from the parents.

The main goal of this paper is to propose a crossover operator that avoids the loss of diversity of the population of individuals, and, at the same time, favors the speed of convergence of the algorithm. These two goals are, at first, conflicting; their adequate balance is controlled by two of the basic features of the crossover operator: i) the balance between exploration and exploitation and, ii) the self-adaptive component. These two features make the evolutionary algorithms avoid premature convergence and favor local fine-tuning. Both attributes are highly appreciated in any search algorithm.

In most current crossover operators, the features of the offspring depend on the features of just a few parents. These crossovers do not take into account population features such as localization and dispersion of the individuals. The use of these statistical features of the population may help the convergence of the population towards the global optimum.

The crossover operator implements basically a depth or exploitative search, just like other methods such as steepest gradient descent, local search or simulated annealing, but in these three search methods the algorithm takes the quality of the solutions into account. So, it is reasonable to think that it is also convenient for the crossover operator to consider the performance on the individuals involved in the crossover operation. This idea is already implemented by some heuristic crossovers [Wri91].

Nevertheless, following the previous line of argument, it seems rather poor to use just two parents, and not to consider the most promising directions towards which it would be advisable to drive the search. That is, instead of using a local heuristic that uses two individuals, involving the whole population or an adequate subset in the determination of the direction of the search whose features would be specially suitable.

Motivated by this line of argument, in this paper we propose a crossover operator, which will be called Confidence Interval Based Crossover using $ L_2$ Norm (CIXL2). On the one hand, it takes advantage of the selective component that is derived from the extraction of the features of the best $ n$ individuals of the population and that indicates the direction of the search, and on the other hand, it makes a self-adaptive sampling around those features whose width depends on the number of best individuals, dispersion of those best individuals, confidence coefficient, and localization of the individuals that participate in the crossover. Now, the exploitation region is not the area between the two parents that are involved in the crossover, but the area defined by the confidence interval built from the $ n$ best individuals of the population; and the exploratory region is the rest of the search domain. To the previous concepts of exploration and exploitation, merely geometrical, is added a probabilistic component that depends on the population features of the best individuals.

Estimation of Distribution Algorithms (EDAs) or Probabilistic Model-Building Evolutionary Algorithms [MP98,MMR99] are based on a, seemingly, similar idea. These algorithms do not have mutation and crossover operators. After every generation the population distribution of the selected individuals is estimated and the new individuals are obtained sampling this estimated distribution. However, the underlying idea behind our crossover is the extraction of population features, mean and standard deviation, in order to detect the regions where there is a higher probability of getting the best individuals. In order to perform the crossover, we create three virtual parents that represent the localization estimator mean, and the bounds of the confidence interval from which, with a certain confidence degree, this localization estimator takes the values. In this way, the children generated from these three parents will inherit the features of the best individuals of the population.

The rest of the paper is organized as follows: Section 2 explains the definition of CIXL2 and its features; Section 3 discusses the problem of the selection of the test sets, and justifies the use of a test set based on the one proposed by Eiben and Bäck [EB97a]; Section 4 describes the experimental setup of evolutionary algorithm (RCGA) used in the tests; Section 5 studies the optimal values of the parameters of CIXL2; Section 6 compares the performance of CIXL2 against other crossovers; Section 7 compares CIXL2 with EDAs; Section 8 describes the application of RCGAs with CIXL2 to neural network ensembles; and, finally, Section 9 states the conclusions of our paper and the future research lines.

Domingo 2005-07-11