The implications of this research reach beyond its relevance to Occam's razor. The post-processor appears to have practical utility in increasing the quality of inferred decision trees. However, if the objective of the research were to improve predictive accuracy rather than to discredit the Occam thesis, the post-processor would be modified in a number of ways.
The first modification would be to enable the addition of multiple partitions at a single leaf from the original tree. C4.5X selects only the single modification for which there is the maximum support. This design decision originated from a desire to minimize the likelihood of performing modifications that will decrease accuracy. In principle, however, it would appear desirable to select all modifications for which there is strong support, each of which could then be inserted into the tree in order of level of supporting evidence.
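The difference between the two selection policies can be sketched as follows. This is a minimal illustration, not C4.5X's actual code: the `Candidate` record, its numeric `support` score, and the `min_support` threshold are all hypothetical stand-ins for whatever evidence measure the post-processor uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A proposed new partition at a leaf (hypothetical structure)."""
    leaf_id: int
    attribute: str
    threshold: float
    support: float  # assumed evidence score; higher means stronger support


def select_single(candidates):
    """Keep only the best-supported candidate per leaf."""
    best = {}
    for c in candidates:
        if c.leaf_id not in best or c.support > best[c.leaf_id].support:
            best[c.leaf_id] = c
    return best


def select_all_strong(candidates, min_support):
    """Keep every candidate meeting the (assumed) support threshold,
    ordered per leaf from strongest to weakest evidence."""
    by_leaf = {}
    for c in candidates:
        if c.support >= min_support:
            by_leaf.setdefault(c.leaf_id, []).append(c)
    for ranked in by_leaf.values():
        ranked.sort(key=lambda c: c.support, reverse=True)
    return by_leaf
```

Under the second policy, each leaf's candidates would be inserted into the tree in the returned order. How "strong support" should be thresholded is left open by the text, so `min_support` is purely a placeholder.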
Even greater increases in accuracy might be expected if one removed the constraint that the post-processing should not alter the performance of the decision tree with respect to the training set. In this case, new partitions may well be found that employ objects from other regions of the instance space to provide evidence in support of adding partitions that correct misclassifications of small numbers of objects at a leaf node from the original tree. The similarity assumption would provide strong evidence for such repartitioning. Such a situation would occur, for example, in the learning problem illustrated in Figure 1 if there were an additional object of class - with attribute values A=2 and B=9. This is illustrated in Figure 3. In this case C4.5 would still create the indicated partitions. However, C4.5X would be unable to relabel the area containing the additional object because of the constraint that it not alter the performance of the original decision tree with respect to the training set. Thus the addition of the object prevents C4.5X from relabeling the shaded region even though, on the basis of the similarity assumption, the object strengthens the evidence in support of that relabeling.
Figure 3: Modified simple instance space
Such an extended post-processor would encourage the following model of inductive inference of decision trees. The role of C4.5 (or a similar system) would be to identify clusters of objects within the instance space that should be grouped under a single leaf node. A second stage would then analyze regions of the instance space that lie outside those clusters in order to allocate classes to those regions. Current decision tree learners, motivated by the Occam thesis, ignore this second stage, leaving regions outside the identified clusters associated with whatever classes have been assigned to them as a by-product of the cluster identification process.
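The two-stage model described above can be sketched in a few lines. The text does not specify how the second stage would allocate classes, so this sketch makes two assumptions purely for illustration: each cluster is represented by the axis-aligned bounding box of the training objects at a leaf, and a point outside every cluster receives the class of the nearest box, as a crude reading of the similarity assumption. All function names are hypothetical.

```python
def bounding_box(points):
    """Smallest axis-aligned box containing all points (stand-in for a cluster)."""
    dims = range(len(points[0]))
    lows = tuple(min(p[d] for p in points) for d in dims)
    highs = tuple(max(p[d] for p in points) for d in dims)
    return lows, highs


def distance_to_box(point, box):
    """Euclidean distance from a point to a box; 0.0 if the point lies inside."""
    lows, highs = box
    total = 0.0
    for x, lo, hi in zip(point, lows, highs):
        gap = max(lo - x, 0.0, x - hi)
        total += gap * gap
    return total ** 0.5


def build_clusters(leaf_objects):
    """Stage 1 stand-in: leaf_objects is a list of (class_label, points) pairs,
    one per leaf, as a tree learner such as C4.5 might group them."""
    return [(label, bounding_box(points)) for label, points in leaf_objects]


def allocate_class(point, clusters):
    """Stage 2 stand-in: assign the class of the nearest cluster (assumed rule)."""
    label, _ = min(clusters, key=lambda c: distance_to_box(point, c[1]))
    return label
```

The point of the sketch is only the separation of concerns: the first stage decides which objects belong together, and the second stage decides, as an explicit step, what to do with the space between the groups rather than inheriting labels from wherever the tree's cut points happened to fall.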