
Introduction

Windowing is a sub-sampling technique proposed by [51] for the ID3 decision tree learning algorithm. Its goal is to reduce the complexity of a learning problem by identifying an appropriate subset of the original data, from which a theory of sufficient quality can be learned. For this purpose, it maintains a subset of the available data, the so-called window, which is used as the training set for the learning algorithm. The window is initialized with a small random sample of the available data, and the learning algorithm induces a theory from this sample. This theory is then tested on the remaining examples. If the quality of the learned theory is not sufficient, the window is adjusted, usually by adding more examples from the training data, and a new theory is learned. This process is repeated until a theory of sufficient quality has been found.
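To make the procedure concrete, the following is a minimal sketch of this basic windowing loop in Python. The learner interface (learn, classifies_correctly) and the size parameters are illustrative assumptions for this sketch, not part of the original algorithm's specification.

import random

def windowing(examples, learn, init_size=100, max_inc=50, max_iterations=100):
    # initialize the window with a small random sample of the available data
    window = random.sample(examples, min(init_size, len(examples)))
    for _ in range(max_iterations):
        theory = learn(window)  # induce a theory from the current window
        # test the theory on the examples outside the window
        rest = [ex for ex in examples if ex not in window]
        mistakes = [ex for ex in rest if not theory.classifies_correctly(ex)]
        if not mistakes:
            return theory       # a theory of sufficient quality has been found
        # adjust the window, here by adding some of the misclassified examples
        window += mistakes[:max_inc]
    return learn(window)        # stop adjusting; learn from the final window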

There are at least three motivations for studying windowing techniques:

Memory Limitations:
Almost all learning algorithms still require all training examples and all background knowledge to reside in main memory. Although memory has become cheap and the main-memory capacity of available hardware platforms is increasing rapidly, there are certainly datasets that are too big to fit into the main memory of conventional computer systems.
Efficiency Gain:
Learning time usually increases (most often super-linearly) with the complexity of a learning problem. Reducing this complexity may be necessary to make a learning problem tractable.
Accuracy Gain:
It has been observed that windowing may also lead to an increase in predictive accuracy. A possible explanation for this phenomenon is that learning from a subset of examples may often result in a theory that is less prone to over-fitting.

In this paper, our major concern is the appropriateness of windowing techniques for increasing the efficiency of inductive rule learning algorithms, such as those using the popular separate-and-conquer (or covering) learning strategy [44,13,52,31]. We will argue that windowing is more suitable for these algorithms than for divide-and-conquer decision-tree learning [51,53] (section 3). Thereafter, we will introduce integrative windowing, a technique that exploits the fact that rule learning algorithms learn each rule independently. We will show that this method significantly improves the performance of windowing by integrating good rules learned in different iterations of the basic windowing procedure into the final theory (section 4). While we have primarily worked with noise-free domains, section 5 will discuss the problem of noise in windowing, as well as some preliminary work that shows how windowing techniques can be adapted to noisy domains. Parts of this work have previously appeared as [28,29,26].
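As a preview of section 4, the following hedged sketch illustrates the core idea of integrative windowing under the same illustrative interface as above: rules that prove consistent on the complete training data are moved into the final theory immediately, and the examples they cover are removed from the problem. The helpers learn_rules and covers and the attribute positive are assumptions of this sketch, not the paper's definitive implementation.

def integrative_windowing(examples, learn_rules, covers,
                          init_size=100, max_inc=50, max_iterations=100):
    remaining = list(examples)  # examples not yet explained by a kept rule
    window = random.sample(remaining, min(init_size, len(remaining)))
    final_theory = []
    for _ in range(max_iterations):
        if not any(ex.positive for ex in remaining):
            break                          # all positive examples explained
        for rule in learn_rules(window):
            covered = [ex for ex in remaining if covers(rule, ex)]
            if covered and all(ex.positive for ex in covered):
                final_theory.append(rule)  # consistent rule: integrate it
                remaining = [ex for ex in remaining if ex not in covered]
                window = [ex for ex in window if not covers(rule, ex)]
            else:
                # inconsistent rule: add some counterexamples to the window
                negs = [ex for ex in covered if not ex.positive]
                window += [ex for ex in negs if ex not in window][:max_inc]
    return final_theory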

