Experimental Setup

Each available data set was divided into two subsets: 75% of the patterns were used for learning, and the remaining 25% for testing the generalization ability of the networks. There are two exceptions, the Sonar and Vowel problems, whose patterns are prearranged into two specific subsets due to their particular features. A summary of these data sets is shown in Table 5. No validation set was used in our experiments.
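As a minimal illustration, a random 75/25 partition of this kind can be sketched as follows (the function name and the use of a fixed seed are our own choices; the paper does not specify the exact sampling procedure):

```python
import random

def split_75_25(patterns, seed=0):
    """Randomly partition a data set: 75% for learning, 25% for testing."""
    rng = random.Random(seed)
    idx = list(range(len(patterns)))
    rng.shuffle(idx)
    cut = round(0.75 * len(patterns))          # size of the learning subset
    learning = [patterns[i] for i in idx[:cut]]
    testing = [patterns[i] for i in idx[cut:]]
    return learning, testing
```

For the Sonar and Vowel problems this step is skipped, since their train/test subsets are fixed in advance.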


Table 5: Summary of data sets. The features of each data set can be C(continuous), B(binary) or N(nominal). The Inputs column shows the number of inputs of the network as it depends not only on the number of input variables but also on their type.
Data set   Cases: Train Test   Classes   Features: C B N   Inputs
Anneal 674 224 5 6 14 18 59
Autos 154 51 6 15 4 6 72
Balance 469 156 3 4 - - 4
Breast-cancer 215 71 2 - 3 6 15
Card 518 172 2 6 4 5 51
German 750 250 2 6 3 11 61
Glass 161 53 6 9 - - 9
Heart 226 76 2 6 3 4 22
Hepatitis 117 38 2 6 13 - 19
Horse 273 91 3 13 2 5 58
Ionosphere 264 87 2 33 1 - 34
Iris 113 37 3 4 - - 4
Labor 43 14 2 8 3 5 29
Liver 259 86 2 6 - - 2
Lymphography 111 37 4 - 9 6 38
Pima 576 192 2 8 - - 8
Promoters 80 26 2 - - 57 114
Segment 1733 577 7 19 - - 19
Sonar 104 104 2 60 - - 60
Soybean 513 170 19 - 16 19 82
TicTacToe 719 239 2 - - 9 9
Vehicle 635 211 4 18 - - 18
Vote 327 108 2 - 16 - 16
Vowel 528 462 11 10 - - 10
Zoo 76 25 7 1 15 - 16


These data sets cover a wide variety of problems: different numbers of available patterns (from 57 to 2310), different numbers of classes (from 2 to 19), different kinds of inputs (nominal, binary, and continuous), and different areas of application, from medical diagnosis to vowel recognition. Testing our model on such a wide variety of problems gives a clear picture of its performance. These are all the data sets to which the method has been applied.

In order to test the efficiency of the proposed crossover in a classical artificial intelligence problem, we have used an RCGA to adjust the weight of each network within the ensemble. Our method considers each ensemble as a chromosome and applies an RCGA to optimize the weight of each network. The weight of each network of the ensemble is codified as a real number. The chromosome formed in this way is subject to CIXL2 crossover and non-uniform mutation. The parameters of CIXL2 are the same as those used in the rest of the paper, $ n=5$ and $ 1-\alpha = 0.7$. The combination method used is the weighted sum of the outputs of the networks. Nevertheless, the same genetic algorithm could also be used to weight each network if a majority voting model were used.
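The weighted-sum combination that the chromosome encodes can be sketched as follows (a minimal illustration in plain Python; the function names and data layout are our own, not taken from the original implementation):

```python
def ensemble_predict(weights, member_outputs):
    """Weighted-sum combination of the ensemble.

    member_outputs[k][p][c] is the output of network k for pattern p and
    class c; weights[k] is the real-valued gene for network k.
    """
    n_patterns = len(member_outputs[0])
    n_classes = len(member_outputs[0][0])
    preds = []
    for p in range(n_patterns):
        # Weighted sum of all networks' outputs for each class.
        combined = [sum(w * net[p][c] for w, net in zip(weights, member_outputs))
                    for c in range(n_classes)]
        preds.append(max(range(n_classes), key=combined.__getitem__))
    return preds

def fitness(weights, member_outputs, targets):
    """Classification accuracy of the weighted ensemble (to be maximized by the RCGA)."""
    preds = ensemble_predict(weights, member_outputs)
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)
```

In the full algorithm, `fitness` would be evaluated for each chromosome of the population, with CIXL2 crossover and non-uniform mutation generating new weight vectors.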

The exact conditions of the experiments for each run of all the algorithms were the following:

Tables 6 and 7 show the results in terms of classification accuracy for the 25 problems. The tables show the results using an RCGA with CIXL2, and the standard BEM and GEM methods. In order to compare the three methods, we have performed a sign test over the win/draw/loss record of the three algorithms [Web00]. These tests are shown in Table 8.


Table 6: Ensemble results using the real-coded genetic algorithm (CIXL2), the basic ensemble method (BEM), and the generalized ensemble method (GEM). For each problem we have marked whether CIXL2 is better (+), equal (=), or worse (-) than BEM/GEM.
Problem Method   Learning: Mean St.Dev. Best Worst   Test: Mean St.Dev. Best Worst
Anneal CIXL2 0.9933 0.0046 0.9985 0.9777   0.9778 0.0090 0.9911 0.9420  
  BEM 0.9879 0.0054 0.9955 0.9733   0.9729 0.0091 0.9911 0.9464 +
  GEM 0.9915 0.0054 0.9985 0.9777   0.9780 0.0103 0.9911 0.9420 -
Autos CIXL2 0.8957 0.0233 0.9416 0.8506   0.7261 0.0577 0.8235 0.5882  
  BEM 0.8649 0.0211 0.9091 0.8312   0.7052 0.0586 0.8039 0.5686 +
  GEM 0.8740 0.0262 0.9351 0.8182   0.7033 0.0707 0.8039 0.5294 +
Balance CIXL2 0.9340 0.0067 0.9446 0.9232   0.9201 0.0118 0.9487 0.8910  
  BEM 0.9179 0.0068 0.9318 0.9019   0.9158 0.0111 0.9423 0.8910 +
  GEM 0.9148 0.0101 0.9318 0.8785   0.9158 0.0110 0.9359 0.8910 +
Breast CIXL2 0.8575 0.0195 0.8930 0.8047   0.6892 0.0322 0.7465 0.6338  
  BEM 0.8321 0.0287 0.8698 0.7395   0.6826 0.0375 0.7606 0.6056 +
  GEM 0.8274 0.0314 0.8791 0.7488   0.6817 0.0354 0.7324 0.6056 +
Cancer CIXL2 0.9723 0.0021 0.9771 0.9676   0.9799 0.0065 0.9885 0.9655  
  BEM 0.9678 0.0034 0.9733 0.9600   0.9793 0.0076 0.9943 0.9655 +
  GEM 0.9673 0.0034 0.9733 0.9581   0.9785 0.0084 0.9885 0.9598 +
Card CIXL2 0.9201 0.0087 0.9363 0.9054   0.8574 0.0153 0.8895 0.8256  
  BEM 0.9074 0.0088 0.9247 0.8880   0.8521 0.0212 0.8953 0.7965 +
  GEM 0.9049 0.0093 0.9208 0.8822   0.8533 0.0203 0.8953 0.7965 +
German CIXL2 0.8785 0.0080 0.8973 0.8653   0.7333 0.0184 0.7640 0.7000  
  BEM 0.8587 0.0090 0.8827 0.8440   0.7355 0.0141 0.7600 0.7040 -
  GEM 0.8642 0.0099 0.8827 0.8427   0.7377 0.0149 0.7680 0.7160 -
Glass CIXL2 0.8509 0.0225 0.9006 0.8075   0.6962 0.0365 0.7736 0.6038  
  BEM 0.8043 0.0246 0.8447 0.7578   0.6824 0.0424 0.7925 0.6038 +
  GEM 0.8246 0.0293 0.8820 0.7640   0.6855 0.0479 0.7736 0.6038 +
Heart CIXL2 0.9297 0.0216 0.9653 0.8861   0.8358 0.0271 0.8971 0.7794  
  BEM 0.9089 0.0214 0.9604 0.8663   0.8333 0.0263 0.8824 0.7794 +
  GEM 0.9182 0.0239 0.9554 0.8663   0.8279 0.0312 0.8971 0.7794 +
Hepa. CIXL2 0.9385 0.0224 0.9744 0.8718   0.8702 0.0372 0.9211 0.8158  
  BEM 0.9131 0.0253 0.9573 0.8462   0.8658 0.0319 0.9211 0.8158 +
  GEM 0.9179 0.0289 0.9744 0.8376   0.8711 0.0399 0.9474 0.7895 -
Horse CIXL2 0.8723 0.0174 0.9084 0.8315   0.7044 0.0313 0.7692 0.6264  
  BEM 0.8444 0.0194 0.8718 0.7949   0.7000 0.0301 0.7582 0.6374 +
  GEM 0.8485 0.0207 0.8864 0.8095   0.7004 0.0300 0.7802 0.6484 +
Ionos. CIXL2 0.9635 0.0164 0.9886 0.9356   0.8950 0.0225 0.9195 0.8276  
  BEM 0.9481 0.0171 0.9773 0.9167   0.8920 0.0206 0.9195 0.8276 +
  GEM 0.9554 0.0205 0.9886 0.9167   0.8958 0.0198 0.9310 0.8621 -
Iris CIXL2 1.0000 0.0000 1.0000 1.0000   1.0000 0.0000 1.0000 1.0000  
  BEM 1.0000 0.0000 1.0000 1.0000   1.0000 0.0000 1.0000 1.0000 =
  GEM 1.0000 0.0000 1.0000 1.0000   1.0000 0.0000 1.0000 1.0000 =



Table 7: Ensemble results using the real-coded genetic algorithm (CIXL2), the basic ensemble method (BEM), and the generalized ensemble method (GEM). For each problem we have marked whether CIXL2 is better (+), equal (=), or worse (-) than BEM/GEM.
Problem Method   Learning: Mean St.Dev. Best Worst   Test: Mean St.Dev. Best Worst
Labor CIXL2 0.9651 0.0257 1.0000 0.8837   0.8857 0.0550 1.0000 0.7857  
  BEM 0.9488 0.0283 0.9767 0.8837   0.8833 0.0663 1.0000 0.7143 +
  GEM 0.9527 0.0270 0.9767 0.8837   0.8833 0.0689 1.0000 0.7143 +
Liver CIXL2 0.8126 0.0175 0.8494 0.7761   0.6992 0.0276 0.7442 0.6512  
  BEM 0.7799 0.0176 0.8108 0.7336   0.6950 0.0253 0.7442 0.6395 +
  GEM 0.7744 0.0198 0.8108 0.7336   0.6826 0.0337 0.7442 0.6047 +
Lymph CIXL2 0.9456 0.0208 0.9730 0.8919   0.7847 0.0538 0.8649 0.6486  
  BEM 0.9318 0.0242 0.9640 0.8739   0.7775 0.0539 0.8649 0.6486 +
  GEM 0.9306 0.0254 0.9730 0.8559   0.7784 0.0504 0.8378 0.6486 +
Pima CIXL2 0.7982 0.0073 0.8194 0.7830   0.7811 0.0209 0.8177 0.7292  
  BEM 0.7782 0.0079 0.7934 0.7535   0.7885 0.0199 0.8177 0.7448 -
  GEM 0.7752 0.0089 0.7882 0.7431   0.7793 0.0222 0.8281 0.7292 +
Promot. CIXL2 0.9496 0.0304 1.0000 0.8875   0.8244 0.0726 1.0000 0.7308  
  BEM 0.9300 0.0357 0.9875 0.8500   0.8269 0.0612 0.9231 0.7308 -
  GEM 0.9263 0.0319 0.9875 0.8625   0.8218 0.0711 0.9615 0.6923 +
Segment CIXL2 0.9502 0.0030 0.9544 0.9446   0.9259 0.0057 0.9376 0.9151  
  BEM 0.9339 0.0042 0.9411 0.9256   0.9183 0.0054 0.9341 0.9081 +
  GEM 0.9423 0.0044 0.9521 0.9319   0.9236 0.0061 0.9359 0.9116 +
Sonar CIXL2 0.9074 0.0236 0.9519 0.8654   0.7849 0.0286 0.8462 0.7404  
  BEM 0.8859 0.0266 0.9423 0.8269   0.7865 0.0286 0.8365 0.7212 -
  GEM 0.8907 0.0277 0.9519 0.8365   0.7853 0.0266 0.8462 0.7404 -
Soybean CIXL2 0.9758 0.0114 0.9903 0.9454   0.9057 0.0165 0.9353 0.8706  
  BEM 0.9602 0.0130 0.9805 0.9240   0.9039 0.0182 0.9353 0.8647 +
  GEM 0.9691 0.0157 0.9883 0.9376   0.9067 0.0187 0.9353 0.8706 -
TicTacToe CIXL2 0.9913 0.0027 0.9972 0.9847   0.9794 0.0024 0.9874 0.9749  
  BEM 0.9868 0.0020 0.9917 0.9847   0.9791 0.0000 0.9791 0.9791 +
  GEM 0.9876 0.0024 0.9930 0.9847   0.9792 0.0008 0.9833 0.9791 +
Vote CIXL2 0.9832 0.0055 0.9939 0.9725   0.9278 0.0110 0.9537 0.8889  
  BEM 0.9793 0.0060 0.9908 0.9664   0.9284 0.0068 0.9444 0.9167 -
  GEM 0.9801 0.0062 0.9908 0.9664   0.9262 0.0107 0.9444 0.8981 +
Vowel CIXL2 0.9146 0.0148 0.9432 0.8845   0.4925 0.0293 0.5606 0.4459  
  BEM 0.8733 0.0179 0.9015 0.8371   0.4913 0.0331 0.5584 0.4264 +
  GEM 0.9157 0.0129 0.9394 0.8845   0.4973 0.0342 0.5541 0.4221 -
Zoo CIXL2 0.9807 0.0175 1.0000 0.9211   0.9360 0.0290 0.9600 0.8800  
  BEM 0.9671 0.0215 1.0000 0.9079   0.9307 0.0392 0.9600 0.8400 +
  GEM 0.9750 0.0203 1.0000 0.9211   0.9307 0.0347 0.9600 0.8400 +



Table 8: Comparison of the three methods. Win/draw/loss record of the algorithms against each other and $ p$-value of the sign test.
Algorithm   BEM      GEM
CIXL2       19/1/5   17/1/7   win/draw/loss
            0.0066   0.0639   $p$-value
BEM                  9/4/12   win/draw/loss
                     0.6636   $p$-value


Table 8 shows the comparison statistics for the three models [Web00]. For each pair of models we show the win/draw/loss record, where the first value is the number of data sets for which col $ <$ row, the second is the number for which col = row, and the third is the number for which col $ >$ row. The second row shows the $ p$-value of a two-tailed sign test on the win-loss record. The table shows that the genetic algorithm using CIXL2 outperforms the two standard algorithms, BEM and GEM, at the 10% significance level. On the other hand, there are no significant differences between BEM and GEM. This result is especially interesting because we have used a comprehensive set of problems from very different domains, with different types of inputs and different numbers of classes.
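The two-tailed sign test used here is straightforward to reproduce: ignoring draws, the $p$-value is twice the binomial tail probability of observing at most $\min(\mathrm{wins},\mathrm{losses})$ successes in $\mathrm{wins}+\mathrm{losses}$ fair coin flips. A minimal sketch (the function name is ours):

```python
from math import comb

def sign_test_p(wins, losses):
    """Two-tailed sign test p-value on a win/loss record (draws are ignored)."""
    n = wins + losses
    k = min(wins, losses)
    # One-sided tail of Binomial(n, 0.5), then doubled for the two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Records from Table 8:
print(round(sign_test_p(19, 5), 4))   # CIXL2 vs BEM -> 0.0066
print(round(sign_test_p(17, 7), 4))   # CIXL2 vs GEM -> 0.0639
print(round(sign_test_p(9, 12), 4))   # BEM vs GEM   -> 0.6636
```

These values match the $p$-values reported in Table 8.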
