
Domains

We evaluated the algorithms on a variety of reasonably large, noise-free training sets from the UCI collection of Machine Learning databases. As our implementation can only handle 2-class problems, we constructed a binary version of the multi-class Shuttle domain by discriminating examples of the majority class from all other classes. In the KRK illegality domain we used a propositional version of the original relational learning problem [46], where each position is encoded by 18 boolean features corresponding to the truth values of the meaningful instantiations of the adjacent, equal, and less_than relations in the background knowledge.
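As an illustration, the following Python sketch shows how a KRK position might be mapped to such boolean features and how the Shuttle labels can be binarized. The particular feature instantiations, the argument names, and the choice of class 1 as the Shuttle majority class are assumptions made here for illustration; they are not taken from the original encoding.

    # Illustrative sketch (not the exact encoding used in the experiments):
    # a KRK position is given by the files and ranks of the white king (wk),
    # white rook (wr), and black king (bk), each in the range 1..8.

    def equal(x, y):
        return x == y

    def less_than(x, y):
        return x < y

    def adjacent(x, y):
        return abs(x - y) <= 1

    def propositionalize(wk_f, wk_r, wr_f, wr_r, bk_f, bk_r):
        # a few example instantiations of the background relations;
        # the full encoding consists of 18 such boolean features
        return {
            "equal(wr_file, bk_file)":     equal(wr_f, bk_f),
            "equal(wr_rank, bk_rank)":     equal(wr_r, bk_r),
            "adjacent(wk_file, bk_file)":  adjacent(wk_f, bk_f),
            "adjacent(wk_rank, bk_rank)":  adjacent(wk_r, bk_r),
            "less_than(wk_file, wr_file)": less_than(wk_f, wr_f),
            # ... remaining meaningful instantiations analogous
        }

    def binarize_shuttle(label, majority_class=1):
        # 2-class Shuttle task: majority class vs. all other classes
        # (treating class 1 as the majority class is an assumption here)
        return "majority" if label == majority_class else "other"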


 
Table 2: Domains used in the experiments, along with their size, the run-time of C4.5 with windowing relative to C4.5 without windowing, and an estimate of the redundancy of the domain.

    Domain            Size     C4.5 -t 1 vs. C4.5   Redundancy
    Mushroom           8,124         98.8 %           46.61 %
    KRKN              10,000         91.2 %           46.05 %
    KRKP               3,196        112.8 %           43.81 %
    KRK (prop.)       10,000        113.8 %           21.88 %
    Tic-Tac-Toe          958        258.0 %            4.15 %
    Binary Shuttle    43,500         55.3 %              --
 

Table 2 shows the total number of examples available for each domain and the ratio of the average run-time of C4.5 with windowing (invoked with the parameter setting -t 1) to that of C4.5 without windowing. The last column shows the redundancy of the domain, estimated with Møller's conditional population entropy heuristic (2). Interestingly enough, there seems to be a negative correlation between the performance of C4.5's windowing algorithm and this redundancy measure. In general, the results with C4.5 confirm the finding of [59] that little can be gained by using windowing with ID3-like learners. The only exception is the Shuttle domain, where windowing saves almost half of C4.5's run-time.
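For concreteness, the sketch below computes a conditional-population-entropy-based redundancy estimate in the spirit of this heuristic. The exact form of equation (2) is not repeated in this section; in particular, normalizing by the maximum attainable attribute entropy is an assumption made here, so the values this sketch produces need not match the last column of Table 2.

    import math
    from collections import Counter

    def redundancy(examples, labels):
        """Hedged sketch of an entropy-based redundancy estimate.
        examples: list of equal-length tuples of discrete attribute values
        labels:   list of class labels, parallel to examples"""
        n = len(examples)
        n_attrs = len(examples[0])
        class_counts = Counter(labels)

        # conditional population entropy: class-conditional entropy of each
        # attribute, weighted by the class priors and summed over attributes
        h_cpe = 0.0
        for c, n_c in class_counts.items():
            p_c = n_c / n
            for a in range(n_attrs):
                value_counts = Counter(e[a] for e, l in zip(examples, labels) if l == c)
                h_cpe += p_c * -sum((v / n_c) * math.log2(v / n_c)
                                    for v in value_counts.values())

        # assumed normalization: entropy of uniformly distributed attribute values
        h_max = sum(math.log2(len(set(e[a] for e in examples)))
                    for a in range(n_attrs))
        return 1.0 - h_cpe / h_max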

