Principal Investigators: Richard Segal and Oren Etzioni.
Brute differs from existing data mining and classification algorithms in that it uses massive search rather than greedy search. Massive search can avoid many of the pitfalls of greedy search, albeit at additional computational cost. Empirical analysis shows that Brute performs much better than greedy algorithms for data mining and has similar performance when used for classification. Surprisingly, Brute's running time is often quite reasonable.
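To make the contrast between greedy and massive search concrete, the following is a minimal sketch on a toy data set. It is not Brute's code: the data, the Laplace-corrected accuracy score, and the function names `laplace`, `exhaustive`, and `greedy` are all assumptions chosen for illustration. The target concept is "a = 1 and b = 1", but the single test "c = 1" scores best on its own, so a greedy learner commits to it and cannot recover.

```python
from itertools import combinations, product

# Hypothetical toy data: (a, b, c, label); label is 1 exactly when a == 1 and b == 1.
ROWS = [
    (1, 1, 1, 1), (1, 1, 1, 1), (1, 1, 0, 1),
    (1, 0, 0, 0), (1, 0, 0, 0),
    (0, 1, 0, 0), (0, 1, 0, 0),
    (0, 0, 1, 0), (0, 0, 0, 0),
]
N_ATTRS = 3  # columns 0..2 are attributes a, b, c; the last column is the label


def laplace(rule):
    """Laplace-corrected accuracy of a rule given as {attribute_index: value}."""
    covered = [r for r in ROWS if all(r[i] == v for i, v in rule.items())]
    pos = sum(r[-1] for r in covered)
    return (pos + 1) / (len(covered) + 2)


def exhaustive(max_len=2):
    """Score every conjunction of up to max_len tests and keep the best one."""
    best = {}
    for k in range(1, max_len + 1):
        for idxs in combinations(range(N_ATTRS), k):
            for vals in product((0, 1), repeat=k):
                rule = dict(zip(idxs, vals))
                if laplace(rule) > laplace(best):
                    best = rule
    return best


def greedy(max_len=2):
    """Add the single best test at each step and never revisit earlier choices."""
    rule = {}
    while len(rule) < max_len:
        candidates = [
            {**rule, i: v}
            for i in range(N_ATTRS) if i not in rule
            for v in (0, 1)
        ]
        best = max(candidates, key=laplace)
        if laplace(best) <= laplace(rule):
            break  # no single added test improves the score
        rule = best
    return rule


if __name__ == "__main__":
    print("exhaustive:", exhaustive(), laplace(exhaustive()))
    print("greedy:    ", greedy(), laplace(greedy()))
```

On this data the exhaustive search returns the rule "a = 1 and b = 1", while the greedy search starts from "c = 1" and ends with a lower-scoring rule. The price is that the exhaustive version scores every candidate conjunction instead of one path through the space.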
Brute uses several pruning axioms to reduce the size of the space it must search. These axioms are sound in that they only remove portions of the search space guaranteed not to contain useful rules. Brute can commonly reduce the search space by a factor of 1,000 or more.
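The idea of a sound pruning rule can be sketched as follows. This example does not reproduce Brute's actual axioms, which are not listed here. It assumes a Laplace-corrected accuracy score and uses one hypothetical optimistic bound: a specialization of a rule can at best keep all of the rule's positive examples and shed all of its negatives, so if even that ideal score cannot beat the best rule found so far, the whole subtree can be skipped without losing the best rule. The data and the names `stats` and `search` are illustrative.

```python
# Hypothetical toy data: (a, b, c, label); label is 1 exactly when a == 1 and b == 1.
ROWS = [
    (1, 1, 1, 1), (1, 1, 1, 1), (1, 1, 0, 1),
    (1, 0, 0, 0), (1, 0, 0, 0),
    (0, 1, 0, 0), (0, 1, 0, 0),
    (0, 0, 1, 0), (0, 0, 0, 0),
]
N_ATTRS = 3  # columns 0..2 are attributes; the last column is the label


def stats(rule):
    """Return (positives covered, rows covered) for a rule of (index, value) tests."""
    covered = [r for r in ROWS if all(r[i] == v for i, v in rule)]
    return sum(r[-1] for r in covered), len(covered)


def search(prune, max_len=3):
    """Depth-first search over conjunctions; returns (best rule, score, nodes visited)."""
    best = {"score": 0.0, "rule": ()}
    visited = 0

    def expand(rule, next_attr):
        nonlocal visited
        # Attributes are added in increasing index order so each conjunction is seen once.
        for i in range(next_attr, N_ATTRS):
            for v in (0, 1):
                child = rule + ((i, v),)
                visited += 1
                pos, n = stats(child)
                score = (pos + 1) / (n + 2)
                # Rules that cover no rows are never accepted as the best rule.
                if n > 0 and score > best["score"]:
                    best["score"], best["rule"] = score, child
                # Optimistic bound: every specialization covers at most `pos`
                # positives, so its score is at most (pos + 1) / (pos + 2).
                bound = (pos + 1) / (pos + 2)
                if prune and bound <= best["score"]:
                    continue  # sound: nothing below this node can beat the best rule
                if len(child) < max_len:
                    expand(child, i + 1)

    expand((), 0)
    return best["rule"], best["score"], visited


if __name__ == "__main__":
    print("without pruning:", search(prune=False))
    print("with pruning:   ", search(prune=True))
```

Both calls return the same best rule, but the pruned search visits fewer candidates. The saving on three binary attributes is small; the point of the sketch is only that a sound bound changes the amount of work done, not the answer.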
Brute supports a wide variety of data formats. It can be used with minimal effort on databases from the UCI repository, C4.5 databases, and IND databases. A program is provided that automatically creates attribute description files, which makes it easy to use Brute on new data sets.
P. Riddle, R. Segal, and O. Etzioni. Representation design and brute-force induction in a Boeing manufacturing domain. Applied Artificial Intelligence, 8:125-147, 1994.
R. Segal and O. Etzioni. Learning decision lists using homogeneous rules. In Proceedings of the Twelfth National Conference on Artificial Intelligence, July, 1994.