Analysis of CIXL2

In this section we analyse the crossover and obtain, for every test function, the following information:

  1. The optimal value for the confidence coefficient $ 1-\alpha $ of the confidence interval. The values used are $ 1-\alpha=\{0.70, 0.90, 0.95, 0.99\}$.
  2. The optimal number of best individuals used by the crossover to calculate the confidence intervals of the mean. The values used are $ n=\{5, 10, 30, 60, 90\}$.

These two factors are not independent, so we will perform an analysis using all the possible pairs ($ 1-\alpha $, $ n$) of the Cartesian product of the two sets. For each pair we will perform 30 runs of the genetic algorithm with different random seeds. Table 2 shows the average value and standard deviation of the 30 runs for each experiment.
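The experimental design above can be sketched in code: the full $(1-\alpha, n)$ grid is the Cartesian product of the two sets, and for each configuration the crossover builds a confidence interval for the mean of the $n$ best fitness values. The sketch below is illustrative only; the variable names and the sample data are hypothetical, and the Student-$t$ critical value is passed in explicitly rather than computed from a distribution.

```python
import itertools
import math

# Parameter sets from the experimental design.
confidence_levels = [0.70, 0.90, 0.95, 0.99]
n_best_values = [5, 10, 30, 60, 90]

# All (1-alpha, n) pairs of the Cartesian product: 4 x 5 = 20 configurations,
# each of which is run 30 times with different random seeds.
grid = list(itertools.product(confidence_levels, n_best_values))

def mean_ci(sample, t_crit):
    """Two-sided confidence interval for the mean of `sample`.

    `t_crit` is the Student-t critical value t_{alpha/2, n-1}; it is taken
    as an argument so the sketch needs no external statistics library.
    """
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    half_width = t_crit * math.sqrt(var / n)
    return mean - half_width, mean + half_width

# Example: interval for the mean of the n = 5 best fitness values
# (made-up data; t_{0.025, 4} = 2.776 corresponds to 1 - alpha = 0.95).
best_fitness = [0.11, 0.14, 0.09, 0.12, 0.10]
low, high = mean_ci(best_fitness, t_crit=2.776)
```
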

Table 2: Average value and standard deviation of the 30 runs for each experiment
Function n $ \mathbf{1-\alpha}$ Mean St. Dev. $ \mathbf{1-\alpha}$ Mean St. Dev. $ \mathbf{1-\alpha}$ Mean St. Dev. $ \mathbf{1-\alpha}$ Mean St. Dev.
Sphere 5 0.70 6.365e-16 2.456e-16 0.90 4.885e-16 1.969e-16 0.95 3.553e-16 1.710e-16 0.99 1.998e-16 6.775e-17
$ f_{Sph}$ 10 5.736e-15 2.495e-15 2.554e-15 8.934e-16 2.642e-15 1.258e-15 1.480e-15 1.032e-15
30 3.728e-12 1.623e-12 1.446e-11 7.062e-12 2.279e-11 1.256e-11 1.248e-10 5.914e-11
60 6.082e-10 2.499e-10 2.867e-08 1.642e-08 1.557e-07 9.911e-08 5.494e-07 6.029e-07
90 3.838e-09 2.326e-09 4.383e-08 3.068e-08 6.840e-08 5.894e-08 1.061e-07 8.401e-08
Schwefel's 5 0.70 1.995e-03 2.280e-03 0.90 8.403e-03 7.748e-03 0.95 7.662e-03 9.693e-03 0.99 1.305e-02 1.303e-02
double sum 10 2.232e-02 2.859e-02 5.407e-02 3.792e-02 4.168e-02 4.383e-02 1.462e-02 1.422e-02
$ f_{SchDS}$ 30 8.464e-02 1.168e-01 3.190e-01 2.798e-01 2.644e-01 2.569e-01 1.223e-01 9.018e-02
60 1.376e-01 1.202e-01 4.059e-01 2.395e-01 2.223e-01 1.384e-01 2.134e-01 1.464e-01
90 8.048e-01 5.403e-01 2.257e+00 1.490e+00 7.048e-01 7.689e-01 2.799e-01 2.322e-01
Rosenbrock 5 0.70 2.494e+01 1.283e+00 0.90 2.506e+01 3.050e-01 0.95 2.497e+01 4.663e-01 0.99 2.463e+01 1.330e+00
$ f_{Ros}$ 10 2.579e+01 2.044e-01 2.591e+01 1.324e-01 2.589e+01 9.426e-02 2.579e+01 1.609e-01
30 2.611e+01 1.471e-01 2.632e+01 1.745e-01 2.642e+01 1.377e-01 2.668e+01 9.999e-02
60 2.576e+01 1.988e-01 2.593e+01 2.292e-01 2.600e+01 4.045e-01 2.617e+01 4.787e-01
90 2.562e+01 2.827e-01 2.570e+01 2.974e-01 2.579e+01 2.629e-01 2.585e+01 3.654e-01
Rastrigin 5 0.70 2.919e+00 1.809e+00 0.90 6.036e+00 2.023e+00 0.95 7.893e+00 2.450e+00 0.99 7.164e+00 2.579e+00
$ f_{Ras}$ 10 6.799e+00 2.480e+00 1.068e+01 3.786e+00 1.297e+01 3.844e+00 1.675e+01 6.554e+00
30 9.452e+00 2.434e+00 1.270e+01 3.522e+00 1.327e+01 4.770e+00 1.552e+01 3.664e+00
60 1.413e+01 4.126e+00 1.837e+01 6.070e+00 1.499e+01 4.434e+00 1.691e+01 4.123e+00
90 1.771e+01 5.063e+00 2.438e+01 7.688e+00 1.987e+01 5.637e+00 2.249e+01 6.058e+00
Schwefel 5 0.70 6.410e+02 2.544e+02 0.90 1.145e+03 5.422e+02 0.95 1.424e+03 6.837e+02 0.99 2.844e+03 4.168e+02
$ f_{Sch}$ 10 1.793e+03 4.172e+02 1.325e+03 2.340e+02 1.486e+03 2.607e+02 2.525e+03 3.069e+02
30 2.675e+03 2.592e+02 2.264e+03 2.758e+02 2.061e+03 2.369e+02 1.986e+03 2.424e+02
60 2.700e+03 1.471e+02 2.513e+03 1.927e+02 2.496e+03 2.146e+02 2.169e+03 2.434e+02
90 2.738e+03 1.476e+02 2.704e+03 1.516e+02 2.672e+03 1.349e+02 2.529e+03 1.837e+02
Ackley 5 0.70 1.378e-08 5.677e-09 0.90 6.320e-09 2.966e-09 0.95 4.677e-09 1.960e-09 0.99 5.188e-09 2.883e-09
$ f_{Ack}$ 10 2.074e-07 9.033e-08 9.544e-08 3.422e-08 9.396e-08 3.513e-08 5.806e-08 2.683e-08
30 8.328e-06 1.403e-06 1.483e-05 3.956e-06 2.246e-05 4.957e-06 4.976e-05 1.298e-05
60 1.019e-04 2.396e-05 8.292e-04 2.097e-04 1.897e-03 9.190e-04 3.204e-03 1.373e-03
90 2.518e-04 7.167e-05 7.544e-04 2.668e-04 9.571e-02 3.609e-01 1.741e-01 5.290e-01
Griewangk 5 0.70 1.525e-02 1.387e-02 0.90 2.463e-02 2.570e-02 0.95 1.574e-02 1.411e-02 0.99 1.285e-02 1.801e-02
$ f_{Gri}$ 10 1.647e-02 1.951e-02 2.695e-02 2.713e-02 2.195e-02 2.248e-02 3.194e-02 3.680e-02
30 2.012e-02 2.372e-02 1.819e-02 1.664e-02 2.321e-02 3.842e-02 2.254e-02 1.877e-02
60 7.884e-03 1.061e-02 2.808e-02 9.686e-02 7.410e-03 1.321e-02 1.582e-02 2.727e-02
90 7.391e-03 7.617e-03 5.248e-03 6.741e-03 8.938e-03 1.196e-02 1.230e-02 2.356e-02
Fletcher 5 0.70 1.523e+04 1.506e+04 0.90 2.293e+04 1.882e+04 0.95 1.286e+04 1.317e+04 0.99 1.527e+04 1.362e+04
$ f_{Fle}$ 10 1.966e+04 1.585e+04 2.248e+04 2.300e+04 1.633e+04 1.344e+04 1.891e+04 1.612e+04
30 2.145e+04 1.631e+04 2.129e+04 1.310e+04 3.049e+04 2.306e+04 2.492e+04 1.967e+04
60 2.133e+04 2.110e+04 2.124e+04 1.213e+04 2.935e+04 2.155e+04 2.374e+04 1.479e+04
90 2.432e+04 2.273e+04 2.898e+04 3.131e+04 2.918e+04 2.418e+04 3.453e+04 2.498e+04
Langerman 5 0.70 -2.064e-01 9.346e-02 0.90 -2.544e-01 1.401e-01 0.95 -3.545e-01 1.802e-01 0.99 -2.803e-01 1.350e-01
$ f_{Lan}$ 10 -2.339e-01 1.280e-01 -2.582e-01 1.574e-01 -2.663e-01 1.247e-01 -2.830e-01 1.645e-01
30 -2.124e-01 1.038e-01 -2.191e-01 1.100e-01 -1.908e-01 9.776e-02 -2.382e-01 1.572e-01
60 -1.975e-01 1.405e-01 -1.752e-01 7.145e-02 -1.762e-01 8.929e-02 -1.949e-01 9.500e-02
90 -1.599e-01 9.057e-02 -1.336e-01 6.042e-02 -1.656e-01 8.336e-02 -1.796e-01 8.453e-02

The results have been studied by means of a two-way analysis of variance (ANOVA II) [DC74,Mil81,SC80], with the fitness of the best individuals, $ A$, as the test variable. This fitness is obtained independently in 30 runs and depends on two fixed factors and their interaction. The fixed factors are: the confidence coefficient $ C$, with four levels, and the number of best individuals $ B$, with five levels. The linear model has the form:

$\displaystyle A_{ij}=\mu + C_i + B_j + CB_{ij} + e_{ij},$     (14)
$\displaystyle i=1,2,3,4 \quad \text{and} \quad j=1,2,3,4,5.$


The hypothesis tests determine the effect of each term on the fitness of the best individuals, $ A$. We have carried out tests for each factor and for the interaction between the two factors. This and all subsequent tests are performed at a confidence level of $ 95\%$. The determination coefficient $ R^2$ of the linear model gives the percentage of the variance of $ A$ that is explained by the model.
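The sum-of-squares decomposition underlying this linear model, and the $ R^2$ computed from it, can be sketched as follows. This is a minimal sketch for a balanced design: the function name, the reduced number of levels and replicates, and the data are all made up for illustration.

```python
# Sum-of-squares decomposition for the two-way fixed-effects model
# A_ij = mu + C_i + B_j + CB_ij + e_ij (balanced design, toy data).

def two_way_anova(cells):
    """`cells` maps (i, j) -> list of replicate fitness values.

    Returns the sums of squares (SS_C, SS_B, SS_CB, SS_E, SS_T) and the
    determination coefficient R^2 of the linear model.
    """
    levels_c = sorted({i for i, _ in cells})
    levels_b = sorted({j for _, j in cells})
    r = len(next(iter(cells.values())))           # replicates per cell
    a, b = len(levels_c), len(levels_b)

    all_obs = [x for obs in cells.values() for x in obs]
    grand = sum(all_obs) / len(all_obs)
    cell_mean = {k: sum(v) / r for k, v in cells.items()}
    c_mean = {i: sum(cell_mean[i, j] for j in levels_b) / b for i in levels_c}
    b_mean = {j: sum(cell_mean[i, j] for i in levels_c) / a for j in levels_b}

    ss_c = b * r * sum((c_mean[i] - grand) ** 2 for i in levels_c)
    ss_b = a * r * sum((b_mean[j] - grand) ** 2 for j in levels_b)
    ss_cells = r * sum((cell_mean[k] - grand) ** 2 for k in cells)
    ss_cb = ss_cells - ss_c - ss_b                # interaction term
    ss_e = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)
    ss_t = sum((x - grand) ** 2 for x in all_obs)
    r2 = 1.0 - ss_e / ss_t                        # variance explained by the model
    return ss_c, ss_b, ss_cb, ss_e, ss_t, r2

# Toy balanced design: 2 levels of each factor, 3 replicates per cell.
data = {(1, 1): [5.0, 5.2, 4.8], (1, 2): [6.1, 6.0, 5.9],
        (2, 1): [4.0, 4.1, 3.9], (2, 2): [7.0, 7.2, 6.8]}
ss_c, ss_b, ss_cb, ss_e, ss_t, r2 = two_way_anova(data)
```

In a balanced design the partition $ SS_T = SS_C + SS_B + SS_{CB} + SS_E$ holds exactly, which the toy data verifies.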

To determine whether there are significant differences among the levels of a factor, we perform a multiple comparison test of the average fitness obtained with the different levels of each factor. First, we carry out a Levene test [Mil96,Lev60] to evaluate the equality of variances. If the hypothesis that the variances are equal is accepted, we perform a Bonferroni test [Mil96] to rank the means of the levels of the factor. Our aim is to find the level of each factor whose average fitness is significantly better than the average fitness of the remaining levels. If the Levene test rejects the equality of variances, we perform a Tamhane test [TD00] instead of a Bonferroni test. Tables 9, 12, and 13 in Appendix A show the results obtained with this methodology.
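The Levene statistic used in this first step can be sketched as follows, with mean-based centring as in Levene's original proposal. The names and data are illustrative; in practice the statistic $ W$ is compared against an $ F_{k-1,\,n-k}$ critical value to decide between the Bonferroni and Tamhane tests.

```python
def levene_statistic(groups):
    """Levene test statistic W for equality of variances across k groups.

    Each observation is transformed to its absolute deviation from its
    group mean, and a one-way ANOVA F statistic is computed on the
    transformed values.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    # Absolute deviations from each group's mean.
    z = []
    for g in groups:
        m = sum(g) / len(g)
        z.append([abs(x - m) for x in g])
    z_bar_i = [sum(zi) / len(zi) for zi in z]           # per-group means of z
    z_bar = sum(x for zi in z for x in zi) / n          # grand mean of z
    between = sum(len(zi) * (zb - z_bar) ** 2 for zi, zb in zip(z, z_bar_i))
    within = sum((x - zb) ** 2 for zi, zb in zip(z, z_bar_i) for x in zi)
    return ((n - k) / (k - 1)) * (between / within)

# Two groups with equal spread give W = 0; unequal spread inflates W
# (made-up data, equal group means so only the spread differs).
w_equal = levene_statistic([[1.0, 1.1, 0.9, 1.0], [1.0, 1.1, 0.9, 1.0]])
w_unequal = levene_statistic([[1.0, 1.1, 0.9, 1.0], [1.0, 2.0, 0.0, 1.0]])
```
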

For the Sphere function, the significance levels $ \alpha ^*$ of each term of the linear model in Table 9 show that none of the factors has a significant effect on the model built to explain the variance of the fitness $ A$. This is due to the fact that $ f_{Sph}$ is easy to optimize, so the fitness behaves as a nearly degenerate random variable with sample variance close to 0. We can see in Table 2 that the best results are obtained with the pair $ (0.99, 5)$. The multiple comparison test of Table 12 confirms that the means obtained with $ n=5$ are significantly better than those obtained with other values. Likewise, the average fitness for $ 1-\alpha=0.70$ is significantly the best one. The results show that, for any value of $ n$, the best value of $ 1-\alpha $ is, in general, $ 1-\alpha=0.70$. Owing to the simple form of $ f_{Sph}$, the best crossover parameters have a strong exploitative component, shifting the population quickly towards the region of the best individuals.

For the unimodal and non-separable functions $ f_{SchDS}$ and $ f_{Ros}$, both factors and their interaction are significant in the linear model that explains the sample variance of $ A$, with a determination coefficient around $ 0.5$. Table 2 shows that the best results are obtained with $ n=5$; the Tamhane test shows that the means obtained with this value of $ n$ are significantly better than those obtained with other values. The results for the confidence coefficient are less conclusive. In fact, for $ f_{Ros}$ there are no significant differences among the different values of $ 1-\alpha $, although the best results are obtained with $ 1-\alpha = 0.70$. For $ f_{SchDS}$ the average fitness for $ \mu_{0.99}$ is the best one, but without significant differences from $ \mu_{0.70}$, and the combination of $ 1-\alpha=0.70$ and $ n=5$ yields the best results. We can conclude that the non-separability of these functions does not require a notable change in the crossover parameters with respect to those used for $ f_{Sph}$.

For $ f_{Ras}$ and $ f_{Sch}$, which are separable and multimodal, the most adequate pair of parameters is $ (0.70, 5)$. For $ f_{Ras}$, the test shows that the performance of this pair is significantly better. However, for $ f_{Sch}$, the best mean is obtained with $ \mu_5$, with results that are significantly better than those obtained with other values, except $ \mu_{10}$. There are no significant differences among $ \mu_{0.70}$, $ \mu_{0.95}$ and $ \mu_{0.90}$. The two factors and their interaction are significant in the linear model, with quite large determination coefficients of $ 0.617$ for $ f_{Ras}$ and $ 0.805$ for $ f_{Sch}$. This means that the factors and their interaction explain a high percentage of the variance of the fitness $ A$.

For $ f_{Ack}$, the best results are obtained with the pair $ (0.95, 5)$. The Tamhane test confirms that $ n=5$ is the most suitable value, while there are no significant differences among $ \mu_{0.70}$, $ \mu_{0.95}$ and $ \mu_{0.99}$. For $ f_{Gri}$ the best results are obtained with the pair $ (0.90, 90)$. The test shows that large values of $ n$ are the most suitable for the optimization of this function, and there are no significant differences among the different values of $ 1-\alpha $. For both functions the determination coefficient of the linear model is low, showing that the linear model does not explain the variance of the fitness. The lack of a linear relation among $ n$, $ 1-\alpha $ and the fitness makes it harder to determine the best values of the crossover parameters.

The case of $ f_{Fle}$ and $ f_{Lan}$ is similar, as the linear model hardly gives any information about the effect of the parameters on the fitness. The most adequate pair for the optimization of these two functions is $ (0.95, 5)$. The test shows that the best values of $ n$ are $ n=5$ and $ n=10$. On the other hand, there are no significant differences among the performances of the crossover for the different values of $ 1-\alpha $.

The overall results show that selecting the best $ n=5$ individuals of the population suffices to obtain a localization estimator good enough to guide the search process, even for multimodal functions where a small value of $ n$ could favor convergence to local optima. However, if the virtual parents have a worse fitness than the parent from the population, the offspring is generated near the latter, and the domain can be explored in multiple directions. In this way, premature convergence towards suboptimal virtual parents is avoided.

However, if the best $ n$ individuals are concentrated in a local optimum, the algorithm will very likely converge to that optimum. That is why, for complex functions, a larger value of $ n$ may be reasonable, as it adds to the confidence interval individuals located in or near different optima. A noteworthy example is $ f_{Gri}$, for which the best results are achieved with $ n=90$ and $ n=60$.

The confidence coefficient bounds the error in the determination of the localization parameter and is responsible for focusing the search. The multiple comparison tests show that the value $ 1-\alpha=0.70$ is the best for 6 problems and is, at least, no worse than the best value in the remaining problems. Hence it can be chosen as the most adequate value of this parameter.

Domingo 2005-07-11