Thu Feb 10, 1:30, WeH 4601

Greedy Attribute Selection
Rich Caruana & Dayne Freitag

Many real-world domains bless us with a plethora of attributes to use for learning. This blessing is often a curse: many inductive methods generalize worse when given too many attributes than when given a "good" subset of those attributes. We examine this problem for two learning tasks in Mitchell's Calendar Apprentice System. We show that ID3 generalizes poorly on these tasks if allowed to use all available attributes. We examine five greedy attribute selection procedures that search for attribute sets that generalize well when given to ID3. Experiments suggest these procedures can yield large improvements in generalization performance. We present a decision tree caching scheme that makes these procedures more practical by substantially reducing their computational cost. We also compare our results with FOCUS's MIN-FEATURES bias on the two tasks.
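A minimal sketch of one family of greedy attribute selection procedures the abstract alludes to: forward selection, which grows the attribute set one attribute at a time, keeping an addition only if it improves an estimate of generalization. This is not the authors' exact procedure; the `score` callback here is a hypothetical stand-in for holdout accuracy of ID3 restricted to the candidate subset.

```python
def forward_select(attributes, score):
    """Greedy forward attribute selection (a sketch, not the paper's code).

    `score(subset)` should estimate generalization accuracy of the
    learner (e.g. ID3) when restricted to `subset`; higher is better.
    """
    selected = []
    best = score(selected)
    improved = True
    while improved:
        improved = False
        # Try each remaining attribute; keep the single best improvement.
        best_attr, best_gain = None, 0.0
        for a in (x for x in attributes if x not in selected):
            gain = score(selected + [a]) - best
            if gain > best_gain:
                best_attr, best_gain = a, gain
        if best_attr is not None:
            selected.append(best_attr)
            best += best_gain
            improved = True
    return selected, best

# Toy scoring function (hypothetical data): only "day" and "time" carry
# signal, and each irrelevant attribute slightly hurts the estimate.
def toy_score(subset):
    signal = {"day": 0.25, "time": 0.15}
    return 0.5 + sum(signal.get(a, -0.02) for a in subset)

subset, acc = forward_select(["day", "time", "room", "speaker"], toy_score)
print(subset, acc)  # selects only the informative attributes
```

The other greedy variants differ mainly in search direction (e.g. backward elimination starts from the full attribute set and removes attributes), but all share this evaluate-candidates-then-commit loop, which is why caching the decision trees built during scoring pays off.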