Experimental Setup
Each available data set was divided into two subsets: 75% of the
patterns were used for learning, and the remaining 25% for testing
the generalization of the networks. There are two exceptions, the Sonar
and Vowel problems, whose patterns are prearranged in two specific
subsets due to their particular features. A summary of these data
sets is shown in Table 5. No validation set was used in our experiments.
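A minimal sketch of such a 75%/25% partition is given here for illustration only. The function name `split_patterns`, the fixed seed and the choice to round the learning subset up are assumptions; the text does not say how fractions were rounded or whether the split was stratified by class.

```python
import math
import random

def split_patterns(patterns, train_fraction=0.75, seed=0):
    """Shuffle the patterns and return (learning, testing) subsets.

    Assumption: the learning subset size is rounded up; the source
    does not specify the rounding rule.
    """
    shuffled = list(patterns)
    random.Random(seed).shuffle(shuffled)
    n_train = math.ceil(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Example with 150 patterns (the size of the Iris data set).
learning, testing = split_patterns(range(150))
print(len(learning), len(testing))  # prints "113 37"
```

The prearranged Sonar and Vowel subsets would bypass this function and be used as given.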
Table 5:
Summary of data sets. The features of each data set can be
C (continuous), B (binary) or N (nominal). The Inputs column shows the
number of network inputs, which depends not only on the number
of input variables but also on their type.
Data set | Cases (Train) | Cases (Test) | Classes | Features (C) | Features (B) | Features (N) | Inputs
Anneal | 674 | 224 | 5 | 6 | 14 | 18 | 59
Autos | 154 | 51 | 6 | 15 | 4 | 6 | 72
Balance | 469 | 156 | 3 | 4 | - | - | 4
Breast-cancer | 215 | 71 | 2 | - | 3 | 6 | 15
Card | 518 | 172 | 2 | 6 | 4 | 5 | 51
German | 750 | 250 | 2 | 6 | 3 | 11 | 61
Glass | 161 | 53 | 6 | 9 | - | - | 9
Heart | 226 | 76 | 2 | 6 | 3 | 4 | 22
Hepatitis | 117 | 38 | 2 | 6 | 13 | - | 19
Horse | 273 | 91 | 3 | 13 | 2 | 5 | 58
Ionosphere | 264 | 87 | 2 | 33 | 1 | - | 34
Iris | 113 | 37 | 3 | 4 | - | - | 4
Labor | 43 | 14 | 2 | 8 | 3 | 5 | 29
Liver | 259 | 86 | 2 | 6 | - | - | 2
Lymphography | 111 | 37 | 4 | - | 9 | 6 | 38
Pima | 576 | 192 | 2 | 8 | - | - | 8
Promoters | 80 | 26 | 2 | - | - | 57 | 114
Segment | 1733 | 577 | 7 | 19 | - | - | 19
Sonar | 104 | 104 | 2 | 60 | - | - | 60
Soybean | 513 | 170 | 19 | - | 16 | 19 | 82
TicTacToe | 719 | 239 | 2 | - | - | 9 | 9
Vehicle | 635 | 211 | 4 | 18 | - | - | 18
Vote | 327 | 108 | 2 | - | 16 | - | 16
Vowel | 528 | 462 | 11 | 10 | - | - | 10
Zoo | 76 | 25 | 7 | 1 | 15 | - | 16

These data sets cover a wide variety of problems. They differ in the
number of available patterns (from 57 to 2310), the number of classes
(from 2 to 19), the kind of inputs (nominal, binary and continuous),
and the area of application (from medical diagnosis to vowel
recognition). Testing our model on this wide variety of problems can
give us a clear idea of its performance.
These are all the sets to which the method has been applied.
In order to test the efficiency of the proposed crossover in a
classical artificial intelligence problem, we have used an RCGA to
adjust the weight of each network within the ensemble. Our method
considers each ensemble as a chromosome in which the weight of each
network is codified as a real number. The chromosome formed in this
way is subject to CIXL2 crossover and non-uniform mutation. The
parameters of CIXL2 are the same as those used in the rest of the
paper. The combination method used is the weighted sum of the outputs
of the networks. Nevertheless, the same genetic algorithm could be
used for weighting each network if a majority voting model is used.
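The two combination rules mentioned above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names and the layout `outputs[k][c]` (output of network k for class c on a single pattern) are assumptions, and ties are broken in favour of the lowest class index.

```python
def weighted_sum_class(outputs, weights):
    """Class chosen by the weighted sum of the network outputs."""
    n_classes = len(outputs[0])
    combined = [sum(w * out[c] for w, out in zip(weights, outputs))
                for c in range(n_classes)]
    return max(range(n_classes), key=combined.__getitem__)

def weighted_vote_class(outputs, weights):
    """Class chosen by weighted majority voting: each network votes
    for its own winning class with a strength equal to its weight."""
    n_classes = len(outputs[0])
    votes = [0.0] * n_classes
    for w, out in zip(weights, outputs):
        winner = max(range(n_classes), key=out.__getitem__)
        votes[winner] += w
    return max(range(n_classes), key=votes.__getitem__)

# Three hypothetical networks, two classes, one pattern.
outputs = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
equal = [1 / 3, 1 / 3, 1 / 3]
print(weighted_sum_class(outputs, equal))   # prints 0
print(weighted_vote_class(outputs, equal))  # prints 1
```

In either case the chromosome is simply the list `weights`, one real number per network, so the same genetic operators apply to both rules.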
The exact conditions of the experiments for each run of all the
algorithms were the following:
- The ensemble was formed by 30 networks. Each network was
trained separately on the learning data using a standard
back-propagation algorithm.
- Once the 30 networks had been trained, the different methods
for obtaining the weights were applied, so all the methods use the
same ensemble of networks on each run of the experiment. For the
genetic algorithm, the fitness of each individual of the population is
the classification accuracy over the learning set.
- After obtaining the vector of weights, the generalization error
of each method is evaluated using the testing data.
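The fitness evaluation in this protocol can be sketched as below, assuming the outputs of the already trained networks have been stored. The function name `ensemble_accuracy` and the layout `outputs[k][p][c]` (network k, pattern p, class c) are hypothetical, and the toy numbers are made up for illustration; they are not taken from the experiments.

```python
def ensemble_accuracy(weights, outputs, targets):
    """Fraction of patterns classified correctly by the weighted-sum
    ensemble. outputs[k][p][c] is the output of network k for class c
    on pattern p; targets[p] is the true class index."""
    correct = 0
    for p, target in enumerate(targets):
        n_classes = len(outputs[0][p])
        combined = [sum(w * net[p][c] for w, net in zip(weights, outputs))
                    for c in range(n_classes)]
        predicted = max(range(n_classes), key=combined.__getitem__)
        correct += int(predicted == target)
    return correct / len(targets)

# Toy learning set: 2 networks, 3 patterns, 2 classes (made-up values).
learning_outputs = [
    [[0.8, 0.2], [0.7, 0.3], [0.6, 0.4]],    # network 0
    [[0.3, 0.7], [0.1, 0.9], [0.45, 0.55]],  # network 1
]
learning_targets = [0, 1, 0]

# Fitness of a chromosome = accuracy over the learning set.
print(ensemble_accuracy([0.5, 0.5], learning_outputs, learning_targets))
print(ensemble_accuracy([1.0, 0.0], learning_outputs, learning_targets))
```

The same function, called with the testing outputs and targets and the final weight vector, would give the generalization accuracy reported for each method. Equal weights correspond to simple averaging of the networks.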
Tables 6 and 7 show the results in
terms of classification accuracy for the 25 problems. The tables show
the results using an RCGA with CIXL2, and the standard BEM and GEM
methods. In order to compare the three methods we have performed a
sign test over the win/draw/loss record of the three algorithms
[Web00]. These tests are shown in Table 8.
Table 6:
Ensemble results using real-coded genetic algorithm (CIXL2),
basic ensemble method (BEM), and generalized ensemble method
(GEM). For each problem we have marked whether CIXL2 is better (+),
equal (=), or worse (-) than BEM/GEM.
Problem | Method | Learning Mean | Learning St.Dev. | Learning Best | Learning Worst | Test Mean | Test St.Dev. | Test Best | Test Worst | +/=/-
Anneal | CIXL2 | 0.9933 | 0.0046 | 0.9985 | 0.9777 | 0.9778 | 0.0090 | 0.9911 | 0.9420 |
Anneal | BEM | 0.9879 | 0.0054 | 0.9955 | 0.9733 | 0.9729 | 0.0091 | 0.9911 | 0.9464 | +
Anneal | GEM | 0.9915 | 0.0054 | 0.9985 | 0.9777 | 0.9780 | 0.0103 | 0.9911 | 0.9420 | -
Autos | CIXL2 | 0.8957 | 0.0233 | 0.9416 | 0.8506 | 0.7261 | 0.0577 | 0.8235 | 0.5882 |
Autos | BEM | 0.8649 | 0.0211 | 0.9091 | 0.8312 | 0.7052 | 0.0586 | 0.8039 | 0.5686 | +
Autos | GEM | 0.8740 | 0.0262 | 0.9351 | 0.8182 | 0.7033 | 0.0707 | 0.8039 | 0.5294 | +
Balance | CIXL2 | 0.9340 | 0.0067 | 0.9446 | 0.9232 | 0.9201 | 0.0118 | 0.9487 | 0.8910 |
Balance | BEM | 0.9179 | 0.0068 | 0.9318 | 0.9019 | 0.9158 | 0.0111 | 0.9423 | 0.8910 | +
Balance | GEM | 0.9148 | 0.0101 | 0.9318 | 0.8785 | 0.9158 | 0.0110 | 0.9359 | 0.8910 | +
Breast | CIXL2 | 0.8575 | 0.0195 | 0.8930 | 0.8047 | 0.6892 | 0.0322 | 0.7465 | 0.6338 |
Breast | BEM | 0.8321 | 0.0287 | 0.8698 | 0.7395 | 0.6826 | 0.0375 | 0.7606 | 0.6056 | +
Breast | GEM | 0.8274 | 0.0314 | 0.8791 | 0.7488 | 0.6817 | 0.0354 | 0.7324 | 0.6056 | +
Cancer | CIXL2 | 0.9723 | 0.0021 | 0.9771 | 0.9676 | 0.9799 | 0.0065 | 0.9885 | 0.9655 |
Cancer | BEM | 0.9678 | 0.0034 | 0.9733 | 0.9600 | 0.9793 | 0.0076 | 0.9943 | 0.9655 | +
Cancer | GEM | 0.9673 | 0.0034 | 0.9733 | 0.9581 | 0.9785 | 0.0084 | 0.9885 | 0.9598 | +
Card | CIXL2 | 0.9201 | 0.0087 | 0.9363 | 0.9054 | 0.8574 | 0.0153 | 0.8895 | 0.8256 |
Card | BEM | 0.9074 | 0.0088 | 0.9247 | 0.8880 | 0.8521 | 0.0212 | 0.8953 | 0.7965 | +
Card | GEM | 0.9049 | 0.0093 | 0.9208 | 0.8822 | 0.8533 | 0.0203 | 0.8953 | 0.7965 | +
German | CIXL2 | 0.8785 | 0.0080 | 0.8973 | 0.8653 | 0.7333 | 0.0184 | 0.7640 | 0.7000 |
German | BEM | 0.8587 | 0.0090 | 0.8827 | 0.8440 | 0.7355 | 0.0141 | 0.7600 | 0.7040 | -
German | GEM | 0.8642 | 0.0099 | 0.8827 | 0.8427 | 0.7377 | 0.0149 | 0.7680 | 0.7160 | -
Glass | CIXL2 | 0.8509 | 0.0225 | 0.9006 | 0.8075 | 0.6962 | 0.0365 | 0.7736 | 0.6038 |
Glass | BEM | 0.8043 | 0.0246 | 0.8447 | 0.7578 | 0.6824 | 0.0424 | 0.7925 | 0.6038 | +
Glass | GEM | 0.8246 | 0.0293 | 0.8820 | 0.7640 | 0.6855 | 0.0479 | 0.7736 | 0.6038 | +
Heart | CIXL2 | 0.9297 | 0.0216 | 0.9653 | 0.8861 | 0.8358 | 0.0271 | 0.8971 | 0.7794 |
Heart | BEM | 0.9089 | 0.0214 | 0.9604 | 0.8663 | 0.8333 | 0.0263 | 0.8824 | 0.7794 | +
Heart | GEM | 0.9182 | 0.0239 | 0.9554 | 0.8663 | 0.8279 | 0.0312 | 0.8971 | 0.7794 | +
Hepa. | CIXL2 | 0.9385 | 0.0224 | 0.9744 | 0.8718 | 0.8702 | 0.0372 | 0.9211 | 0.8158 |
Hepa. | BEM | 0.9131 | 0.0253 | 0.9573 | 0.8462 | 0.8658 | 0.0319 | 0.9211 | 0.8158 | +
Hepa. | GEM | 0.9179 | 0.0289 | 0.9744 | 0.8376 | 0.8711 | 0.0399 | 0.9474 | 0.7895 | -
Horse | CIXL2 | 0.8723 | 0.0174 | 0.9084 | 0.8315 | 0.7044 | 0.0313 | 0.7692 | 0.6264 |
Horse | BEM | 0.8444 | 0.0194 | 0.8718 | 0.7949 | 0.7000 | 0.0301 | 0.7582 | 0.6374 | +
Horse | GEM | 0.8485 | 0.0207 | 0.8864 | 0.8095 | 0.7004 | 0.0300 | 0.7802 | 0.6484 | +
Ionos. | CIXL2 | 0.9635 | 0.0164 | 0.9886 | 0.9356 | 0.8950 | 0.0225 | 0.9195 | 0.8276 |
Ionos. | BEM | 0.9481 | 0.0171 | 0.9773 | 0.9167 | 0.8920 | 0.0206 | 0.9195 | 0.8276 | +
Ionos. | GEM | 0.9554 | 0.0205 | 0.9886 | 0.9167 | 0.8958 | 0.0198 | 0.9310 | 0.8621 | -
Iris | CIXL2 | 1.0000 | 0.0000 | 1.0000 | 1.0000 | 1.0000 | 0.0000 | 1.0000 | 1.0000 |
Iris | BEM | 1.0000 | 0.0000 | 1.0000 | 1.0000 | 1.0000 | 0.0000 | 1.0000 | 1.0000 | =
Iris | GEM | 1.0000 | 0.0000 | 1.0000 | 1.0000 | 1.0000 | 0.0000 | 1.0000 | 1.0000 | =

Table 7:
Ensemble results using real-coded genetic algorithm (CIXL2),
basic ensemble method (BEM), and generalized ensemble method
(GEM). For each problem we have marked whether CIXL2 is better (+),
equal (=), or worse (-) than BEM/GEM.
Problem | Method | Learning Mean | Learning St.Dev. | Learning Best | Learning Worst | Test Mean | Test St.Dev. | Test Best | Test Worst | +/=/-
Labor | CIXL2 | 0.9651 | 0.0257 | 1.0000 | 0.8837 | 0.8857 | 0.0550 | 1.0000 | 0.7857 |
Labor | BEM | 0.9488 | 0.0283 | 0.9767 | 0.8837 | 0.8833 | 0.0663 | 1.0000 | 0.7143 | +
Labor | GEM | 0.9527 | 0.0270 | 0.9767 | 0.8837 | 0.8833 | 0.0689 | 1.0000 | 0.7143 | +
Liver | CIXL2 | 0.8126 | 0.0175 | 0.8494 | 0.7761 | 0.6992 | 0.0276 | 0.7442 | 0.6512 |
Liver | BEM | 0.7799 | 0.0176 | 0.8108 | 0.7336 | 0.6950 | 0.0253 | 0.7442 | 0.6395 | +
Liver | GEM | 0.7744 | 0.0198 | 0.8108 | 0.7336 | 0.6826 | 0.0337 | 0.7442 | 0.6047 | +
Lymph | CIXL2 | 0.9456 | 0.0208 | 0.9730 | 0.8919 | 0.7847 | 0.0538 | 0.8649 | 0.6486 |
Lymph | BEM | 0.9318 | 0.0242 | 0.9640 | 0.8739 | 0.7775 | 0.0539 | 0.8649 | 0.6486 | +
Lymph | GEM | 0.9306 | 0.0254 | 0.9730 | 0.8559 | 0.7784 | 0.0504 | 0.8378 | 0.6486 | +
Pima | CIXL2 | 0.7982 | 0.0073 | 0.8194 | 0.7830 | 0.7811 | 0.0209 | 0.8177 | 0.7292 |
Pima | BEM | 0.7782 | 0.0079 | 0.7934 | 0.7535 | 0.7885 | 0.0199 | 0.8177 | 0.7448 | -
Pima | GEM | 0.7752 | 0.0089 | 0.7882 | 0.7431 | 0.7793 | 0.0222 | 0.8281 | 0.7292 | +
Promot. | CIXL2 | 0.9496 | 0.0304 | 1.0000 | 0.8875 | 0.8244 | 0.0726 | 1.0000 | 0.7308 |
Promot. | BEM | 0.9300 | 0.0357 | 0.9875 | 0.8500 | 0.8269 | 0.0612 | 0.9231 | 0.7308 | -
Promot. | GEM | 0.9263 | 0.0319 | 0.9875 | 0.8625 | 0.8218 | 0.0711 | 0.9615 | 0.6923 | +
Segment | CIXL2 | 0.9502 | 0.0030 | 0.9544 | 0.9446 | 0.9259 | 0.0057 | 0.9376 | 0.9151 |
Segment | BEM | 0.9339 | 0.0042 | 0.9411 | 0.9256 | 0.9183 | 0.0054 | 0.9341 | 0.9081 | +
Segment | GEM | 0.9423 | 0.0044 | 0.9521 | 0.9319 | 0.9236 | 0.0061 | 0.9359 | 0.9116 | +
Sonar | CIXL2 | 0.9074 | 0.0236 | 0.9519 | 0.8654 | 0.7849 | 0.0286 | 0.8462 | 0.7404 |
Sonar | BEM | 0.8859 | 0.0266 | 0.9423 | 0.8269 | 0.7865 | 0.0286 | 0.8365 | 0.7212 | -
Sonar | GEM | 0.8907 | 0.0277 | 0.9519 | 0.8365 | 0.7853 | 0.0266 | 0.8462 | 0.7404 | -
Soybean | CIXL2 | 0.9758 | 0.0114 | 0.9903 | 0.9454 | 0.9057 | 0.0165 | 0.9353 | 0.8706 |
Soybean | BEM | 0.9602 | 0.0130 | 0.9805 | 0.9240 | 0.9039 | 0.0182 | 0.9353 | 0.8647 | +
Soybean | GEM | 0.9691 | 0.0157 | 0.9883 | 0.9376 | 0.9067 | 0.0187 | 0.9353 | 0.8706 | -
TicTacToe | CIXL2 | 0.9913 | 0.0027 | 0.9972 | 0.9847 | 0.9794 | 0.0024 | 0.9874 | 0.9749 |
TicTacToe | BEM | 0.9868 | 0.0020 | 0.9917 | 0.9847 | 0.9791 | 0.0000 | 0.9791 | 0.9791 | +
TicTacToe | GEM | 0.9876 | 0.0024 | 0.9930 | 0.9847 | 0.9792 | 0.0008 | 0.9833 | 0.9791 | +
Vote | CIXL2 | 0.9832 | 0.0055 | 0.9939 | 0.9725 | 0.9278 | 0.0110 | 0.9537 | 0.8889 |
Vote | BEM | 0.9793 | 0.0060 | 0.9908 | 0.9664 | 0.9284 | 0.0068 | 0.9444 | 0.9167 | -
Vote | GEM | 0.9801 | 0.0062 | 0.9908 | 0.9664 | 0.9262 | 0.0107 | 0.9444 | 0.8981 | +
Vowel | CIXL2 | 0.9146 | 0.0148 | 0.9432 | 0.8845 | 0.4925 | 0.0293 | 0.5606 | 0.4459 |
Vowel | BEM | 0.8733 | 0.0179 | 0.9015 | 0.8371 | 0.4913 | 0.0331 | 0.5584 | 0.4264 | +
Vowel | GEM | 0.9157 | 0.0129 | 0.9394 | 0.8845 | 0.4973 | 0.0342 | 0.5541 | 0.4221 | -
Zoo | CIXL2 | 0.9807 | 0.0175 | 1.0000 | 0.9211 | 0.9360 | 0.0290 | 0.9600 | 0.8800 |
Zoo | BEM | 0.9671 | 0.0215 | 1.0000 | 0.9079 | 0.9307 | 0.0392 | 0.9600 | 0.8400 | +
Zoo | GEM | 0.9750 | 0.0203 | 1.0000 | 0.9211 | 0.9307 | 0.0347 | 0.9600 | 0.8400 | +

Table 8 shows the comparison statistics for the three
models [Web00]. For each model we show the win/draw/loss
statistic, where the first value is the number of data sets for which
col < row, the second is the number for which col = row,
and the third is the number for which col > row. The second
row shows the p-value of a two-tailed sign test on the win-loss
record. The table shows that the genetic algorithm using CIXL2
outperforms the two standard algorithms, BEM and GEM, at the 10%
significance level. On the other hand, there are no significant differences
between BEM and GEM. This result is especially interesting because we
have used a comprehensive set of problems from very different domains,
different types of inputs, and different numbers of classes.
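As a worked illustration of the two-tailed sign test on a win-loss record, the following Python sketch computes the p-value under the null hypothesis that wins and losses are equally likely. The function name `sign_test_p_value` is ours. The tallies are counted by hand from the +, = and - marks in Tables 6 and 7 (19/1/5 for CIXL2 against BEM and 17/1/7 against GEM), and discarding draws is an assumption, although it is the usual convention for this test.

```python
from math import comb

def sign_test_p_value(wins, losses):
    """Two-tailed sign test p-value for a win-loss record.

    Draws are assumed to have been discarded beforehand. Under the
    null hypothesis the number of wins is Binomial(n, 0.5).
    """
    n = wins + losses
    k = min(wins, losses)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Win/loss tallies counted from the marks in Tables 6 and 7.
print(round(sign_test_p_value(19, 5), 4))  # CIXL2 vs BEM: 0.0066
print(round(sign_test_p_value(17, 7), 4))  # CIXL2 vs GEM: 0.0639
```

Both values computed this way fall below 0.10.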
Domingo
2005-07-11