Next: The Size of a Up: Building Small Accurate Decision Previous: Building Small Accurate Decision

## The Size of a DC is Measured as its Whole Number of Literals

Theorem 1   When the size of a DC is measured as its whole number of literals, it is -Hard to find the smallest decision committee consistent with a set of examples .

Proof: See the Appendix.
We can easily adapt Theorem 1 to the case where the rules are replaced by weighted DT as advocated in boosted C4.5 [Schapire Singer1998]. Here, each tree returns a class , and each tree is given a real weight to leverage its vote. The sign of the linear combination gives the class of an example. The following theorem holds again with any limitations on the leveraging coefficients (as long as at least one non-zero value is authorized), or without limitation on the coefficients. By this, we mean that for each of the applicable limitations (or without), the problem is -Hard. The size notion is the sum, over all trees, of their number of nodes.

Theorem 2   It is -Hard to find the smallest weighted linear combinations of DT consistent with a set of examples , without limitation on the leveraging coefficients, or for any possible limitation, as long as at least one non-zero value is authorized.

While it is well known that boosting results in a rapid decreasing of the error over which can easily and rapidly drop down to zero (as long as it is possible), Theorem 2 shows that attempts to efficiently reduce the size of the vote when boosting DT is -Hard. If the problem is simplified to the to pruning of a large consistent vote of DT [Margineantu Dietterich1997], to obtain a smaller consistent (or with limited error) vote with restricted size, it is again possible (using the same reduction) to show that this brings -Hardness.

Next: The Size of a Up: Building Small Accurate Decision Previous: Building Small Accurate Decision
©2002 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.