Conclusions and Future Work

In this paper we have proposed a crossover operator that allows the offspring to inherit features common to the best individuals of the population. These common features are extracted by determining confidence intervals of the mean of the best individuals of the population. From these confidence intervals, CIXL2 creates three virtual parents that are used to implement a directed search towards the region of the fittest individuals. The amplitude and speed of the search are determined by the number of best individuals selected and the confidence coefficient.

The study carried out to obtain the best parameters for CIXL2 concludes that $ n=5$ best individuals is a suitable value for obtaining the localization estimator that guides the search in most of the problems tested. However, in very difficult problems, a larger value of $ n$ would be advisable to avoid premature convergence of the evolutionary process. The confidence coefficient, $ 1-\alpha $, is responsible, together with the dispersion of the best individuals, for modulating the width of the confidence interval centered on the localization estimator. The study identifies $ 1-\alpha=0.70$ as the best value. This pair of values performs acceptably on all problems, although no single pair of values is optimal for all of them.
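As an illustration of how the two parameters interact, the following minimal sketch computes a per-gene confidence interval for the mean of the $ n$ best individuals and collects the lower limits, means, and upper limits into three virtual parents. It assumes the standard Student-t interval for a mean, which in turn assumes normally distributed genes. The function names, the sample data, and the explicitly supplied critical value `T_CRIT` are hypothetical choices made for this example, not the implementation used in the experiments.

```python
import math
import statistics


def confidence_interval(genes, t_crit):
    """Return (lower, mean, upper) for the mean of one gene over the n best individuals.

    t_crit is the Student-t critical value for n - 1 degrees of freedom and the
    chosen confidence coefficient. It is passed in explicitly so that this
    sketch needs only the standard library.
    """
    n = len(genes)
    mean = statistics.fmean(genes)
    s = statistics.stdev(genes)  # sample standard deviation (n - 1 denominator)
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean, mean + half_width


def virtual_parents(best, t_crit):
    """Build three virtual parents (lower limits, means, upper limits), gene by gene."""
    columns = list(zip(*best))  # transpose: one tuple of values per gene
    intervals = [confidence_interval(col, t_crit) for col in columns]
    lower = [iv[0] for iv in intervals]
    mean = [iv[1] for iv in intervals]
    upper = [iv[2] for iv in intervals]
    return lower, mean, upper


# Hypothetical data: n = 5 best individuals with two genes each.
best = [
    [1.0, 10.0],
    [2.0, 10.0],
    [3.0, 10.0],
    [4.0, 10.0],
    [5.0, 10.0],
]
# Approximate t quantile for 4 degrees of freedom and 1 - alpha = 0.70,
# taken from a standard t table (an assumption of this example).
T_CRIT = 1.190
lower, mean, upper = virtual_parents(best, T_CRIT)
```

In this example the first gene is dispersed, so its interval has a non-zero width, while the second gene is identical in all five individuals and its interval collapses to a single point. A larger confidence coefficient or a larger dispersion widens the interval, and a larger $ n$ narrows it.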

The comparative analysis of the crossover operators shows that CIXL2 is a good alternative to widely used crossovers such as $ BLX_\alpha$ for unimodal functions such as $ f_{Sph}$, $ f_{SchDS}$, and $ f_{Ros}$. Noteworthy is the performance of CIXL2 in the two non-separable functions, $ f_{SchDS}$ and $ f_{Ros}$, where the other crossovers have a disparate behavior.

While in unimodal functions the strategy of extracting the statistical features of localization and dispersion of the best individuals is a guarantee of good performance, the case for multimodal functions is quite different, and the performance of the algorithm is not assured a priori. Nevertheless, the results obtained for this kind of function show that CIXL2 is always one of the best performing operators. For instance, in functions of high complexity such as $ f_{Ack}$ (multimodal, non-separable and regular) and $ f_{Fle}$ (multimodal, non-separable and irregular), CIXL2 obtains the best results. This behavior reveals that the determination of the region of the best individuals by means of confidence intervals provides a robust methodology that, applied to a crossover operator, performs well even in very difficult functions. In summary, this paper shows that CIXL2 is a promising alternative to bear in mind when choosing which crossover to use in a real-coded genetic algorithm.

EDAs have shown very good performance for unimodal and separable functions, $ f_{Sph}$, and for functions whose optima are regularly distributed, $ f_{Ack}$ and $ f_{Gri}$. The performance of EDAs decreases in multimodal, $ f_{Ras}$ and $ f_{Sch}$, and epistatic functions, $ f_{SchDS}$ and $ f_{Ros}$. CIXL2, on the other hand, is less sensitive to the type of function. The main reason for this behavior may be that CIXL2 uses the distribution information obtained from the best individuals of the population differently. CIXL2 creates three virtual parents from this distribution, but if the virtual parents have worse fitness than the individual that mates, the offspring is not generated near these virtual parents. In this way, CIXL2 prevents the population from shifting to the confidence interval if the improvement in performance is not significant.
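The fitness check described above can be sketched for a single gene as follows. This is only an illustrative reading of the idea, under the assumptions that higher fitness is better and that the offspring gene is sampled on the far side of the fitter of the two genes. The function name and its parameters are hypothetical, and the sampling rule is an assumption of this example rather than the operator's exact definition.

```python
import random


def directed_offspring_gene(parent_gene, virtual_gene,
                            parent_fitness, virtual_fitness, rng=random):
    """Place one offspring gene near whichever of the two parents is fitter.

    Assumption: higher fitness is better. The offspring is sampled starting at
    the fitter gene and moving away from the worse one.
    """
    r = rng.random()  # uniform in [0, 1)
    if parent_fitness > virtual_fitness:
        # The mating individual is better: stay near it and move away from
        # the virtual parent instead of shifting towards the interval.
        return parent_gene + r * (parent_gene - virtual_gene)
    # The virtual parent is at least as good: search near it instead.
    return virtual_gene + r * (virtual_gene - parent_gene)


rng = random.Random(0)  # fixed seed so the example is reproducible
# Parent gene at 2.0 is fitter than the virtual parent gene at 5.0.
near_parent = directed_offspring_gene(2.0, 5.0, 1.0, 0.5, rng)
# Same genes, but now the virtual parent is fitter.
near_virtual = directed_offspring_gene(2.0, 5.0, 0.5, 1.0, rng)
```

With these inputs the first offspring gene falls in the range from -1.0 to 2.0, on the side of the parent that faces away from the virtual parent, and the second falls between 5.0 and 8.0, beyond the virtual parent.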

The application of the proposed crossover to a problem of artificial neural network ensembles shows how this model can be used for solving standard artificial intelligence problems. RCGAs with CIXL2 can also be used in other aspects of ensemble design, such as the selection of a subset of networks and the sampling of the training set of each network.

These promising results motivate a new line of research geared to the study of the distribution of the best individuals, taking into account the kind of problem at hand. We aim to propose new techniques for selecting the individuals to be considered, so that the confidence interval can be obtained in a more reliable way. In functions that are multimodal, irregular, or have many chaotically scattered optima, obtaining the distribution of the best individuals is extremely difficult. For this kind of function it would be interesting to perform a cluster analysis of the selected best individuals and to obtain a confidence interval for every cluster. This idea would allow the implementation of a multi-directional crossover towards different promising regions.

Moreover, it is likely that the distribution of the best individuals changes as the evolutionary process progresses. In such a case, it would be advisable to perform, at regular intervals, statistical tests to determine the distribution that best reflects the features of the best individuals in the population.

Alternatively, we are considering the construction of non-parametric confidence intervals. For this, we need more robust estimators of the parameters of localization and dispersion of the genes of the best individuals. We have performed some preliminary studies using the median and different measures of dispersion, and the results are quite encouraging.
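To illustrate why robust estimators are attractive here, the short sketch below compares the mean with the median for one gene, and uses the median absolute deviation (MAD) as the dispersion measure. The choice of MAD, the function name, and the sample values are assumptions made for this example. The text above does not specify which dispersion measures were studied.

```python
import statistics


def robust_location_dispersion(genes):
    """Return (median, MAD) for the values of one gene over the best individuals."""
    med = statistics.median(genes)
    # Median absolute deviation: the median distance of the values from the median.
    mad = statistics.median([abs(g - med) for g in genes])
    return med, mad


# Hypothetical gene values for n = 5 best individuals, one of them an outlier.
genes = [1.0, 2.0, 3.0, 4.0, 100.0]
med, mad = robust_location_dispersion(genes)
mean = statistics.fmean(genes)
```

Here the single outlier pulls the mean to 22.0, while the median stays at 3.0 and the MAD at 1.0, so an interval built around the median would remain close to the bulk of the best individuals.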

Another line of research currently open is the application of CIXL2 to constrained optimization problems, especially in the presence of non-linearity, where generating individuals in the feasible region is a major difficulty. The orientation of the search implemented by CIXL2, based on identifying the region of the best individuals, could favor the generation of feasible individuals. This feature would be an interesting advantage over other crossover operators.

The authors would like to thank R. Moya-Sánchez for her help with the final version of this paper.

This work has been financed in part by the project TIC2002-04036-C05-02 of the Spanish Inter-Ministerial Commission of Science and Technology (CICYT) and FEDER funds.

Domingo 2005-07-11