While the above process could be applied to both continuous and discrete attributes, the current implementation addresses only continuous attributes.

The post-processor operates by examining each leaf *l* of the tree in turn.
For each *l*, each attribute *a* is considered in turn. For each *a*, all
possible thresholds below and above the region of the instance space occupied
by objects at *l* are explored. First, the minimum (*min*) and maximum
(*max*) are determined for values of *a* that are possible for objects that
can reach *l*. If *l* lies below the <= branch of a split on *a*, then the
threshold for that split provides an upper limit (*max*) on values of *a* at
*l*. If it lies below a > branch, the threshold provides a lower limit
(*min*). Where the node does not lie below a <= branch on *a*, no upper limit
applies and *max* is taken as infinity. Where the node does not lie below a >
branch on *a*, no lower limit applies and *min* is taken as negative infinity.
Only objects from the training set that have values of *a* within the range
*min*..*max* are considered in the following operations.
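The determination of *min* and *max* can be sketched as follows. This is a minimal illustration, not the C4.5 data structures: the `Node` class and its field names are assumptions, with the convention that the left child is the <= branch and the right child is the > branch.

```python
import math

class Node:
    """Illustrative decision-tree node; not C4.5's internal representation."""
    def __init__(self, attr=None, threshold=None, parent=None):
        self.attr = attr            # attribute tested at this split (None at a leaf)
        self.threshold = threshold  # split threshold: left branch is <=, right is >
        self.left = None
        self.right = None
        self.parent = parent

def bounds_for(leaf, attr):
    """Return (min, max) limits on `attr` for objects that can reach `leaf`,
    by walking from the leaf to the root and tightening the bounds at each
    ancestor split on `attr`."""
    lo, hi = -math.inf, math.inf
    child, node = leaf, leaf.parent
    while node is not None:
        if node.attr == attr:
            if child is node.left:      # leaf lies below the <= branch
                hi = min(hi, node.threshold)
            else:                       # leaf lies below the > branch
                lo = max(lo, node.threshold)
        child, node = node, node.parent
    return lo, hi
```

Where the leaf lies below no split on the attribute, the bounds remain infinite, matching the description above.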

For each value observed in the training set for the attribute within the
allowable range but outside the actual range of values of *a* for objects at
*l*, the evidence is evaluated in support of reclassifying the region above or
below that threshold. The level of support for a given threshold is evaluated
using a Laplacian accuracy estimate
[Niblett and Bratko, 1986]. Because each leaf relates to a
binary classification (an object belongs to the class in question or
does not), the binary form of the Laplace estimate is used. For threshold *t*
on attribute *a* at leaf *l*, the evidence in support of labeling the
partition below *t* with class *n* is the maximum, over ancestor nodes *x*
of *l*, of

(*P* + 1) / (*T* + 2)

where *T* is the number of objects at *x* for which *min* < *a*
<= *t*, and *P* is the number of those objects that belong to class
*n*.
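The binary Laplace accuracy estimate is a one-line computation; a sketch in Python:

```python
def laplace_binary(P, T):
    """Binary Laplace accuracy estimate: (P + 1) / (T + 2),
    where T is the number of objects considered and P the number
    of those belonging to the class in question."""
    return (P + 1) / (T + 2)
```

With no objects (*T* = 0) the estimate is 0.5, reflecting complete uncertainty between the two outcomes; as *T* grows the estimate approaches the observed accuracy *P*/*T*.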

The evidence in support of labeling a partition above a threshold is
calculated identically with the exception that the objects for which
*t* < *a* <= *max* are instead considered.
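The maximization over ancestor nodes might be sketched as follows. The representation of each ancestor as a list of (attribute-value, class) pairs is an illustrative simplification, not C4.5X's internal form; the same function serves for the partition above a threshold by swapping the range test.

```python
def partition_support(ancestor_objects, lo, t, cls):
    """Evidence for labeling the partition (lo, t] with class `cls`:
    the maximum, over ancestor nodes, of the binary Laplace estimate
    computed from the objects at that node falling in the range.
    Each ancestor is given as a list of (attribute-value, class) pairs."""
    best = 0.0
    for objs in ancestor_objects:
        in_range = [c for v, c in objs if lo < v <= t]
        T = len(in_range)
        P = sum(1 for c in in_range if c == cls)
        best = max(best, (P + 1) / (T + 2))
    return best
```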

If the maximum evidence for a new labeling exceeds the evidence for the current labeling of the region, a new branch is added for the appropriate threshold, creating a new leaf node labeled with the appropriate class.
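The comparison step might be sketched as follows; the tuple layout for candidate relabelings is a hypothetical representation for illustration only.

```python
def best_relabeling(candidates, current_evidence):
    """Given candidate relabelings as (evidence, threshold, side, cls) tuples,
    return the candidate to graft onto the tree, or None if no candidate's
    evidence exceeds that for the current labeling of the region."""
    best = max(candidates, default=None, key=lambda c: c[0])
    if best is not None and best[0] > current_evidence:
        return best
    return None
```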

In addition to evidence in favor of the current labeling gathered as
above, further evidence in support of the current labeling of a region
is calculated using the Laplace accuracy estimate considering the
objects at the leaf, where *T* is the number of objects at
the leaf and *P* is the number of those objects that belong to the
class with which the node is labeled.
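This additional evidence for the current labeling is the same Laplace estimate applied to the leaf's own objects; a minimal sketch, assuming each object is represented simply by its class label:

```python
def current_label_support(leaf_classes, label):
    """Laplace evidence for the leaf's current labeling: T is the number
    of objects at the leaf, P the number belonging to the leaf's class."""
    T = len(leaf_classes)
    P = sum(1 for c in leaf_classes if c == label)
    return (P + 1) / (T + 2)
```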

This approach ensures that all new partitions define true regions. That
is, for any attribute *a* and value *v* it is not possible to partition on
*a* <= *v* unless it is possible for both objects from the *domain*
with values of *a* greater than *v* and objects with values less than
or equal to
*v* to reach the node being partitioned (even though no objects from the
*training set* will fall within the new partition). In particular,
this ensures that the new cuts are not simple duplications of existing
cuts at ancestors to the current node. Thus, every modification adds
non-redundant complexity to the tree.

This algorithm is presented in Figure 2. It has been implemented as a modification to C4.5 release 6, called C4.5X. The source code for these modifications is available as an on-line appendix to this paper.

**Figure 2:** C4.5X post-processing algorithm

In C4.5X, where multiple sets of values equally satisfy the specified constraints and maximize the Laplace function, values that are deeper in the tree are selected over those closer to the root; at a single node, preference depends upon the order of the attributes in the definition of the data and, between values of a single attribute, upon the order of the data. These selection strategies are a side effect of the implementation of the system. There is no reason to believe that the experimental results would differ in general if other strategies were used to select between competing constraints.

By default, C4.5 develops two decision trees each time that it is run, an unpruned and a pruned (simplified) decision tree. C4.5X produces post-processed versions of both of these trees.

Mon Sep 9 12:13:30 EST 1996