Next: The Algorithm Up: Towards Noise-Tolerant Windowing Previous: Completeness-Check

Resampling

Our main concern with resampling was that adding only misclassified examples is likely to increase the noise level inside the window. To avoid this, we form a set of candidates consisting of all examples that are not yet in the window and are covered by insignificant rules, plus all uncovered positive examples. The algorithm then selects MaxIncSize of these candidates and adds them to the window. Among the uncovered examples we add only the positive ones: as more and more rules are discovered, the proportion of positive examples in the remaining training set decreases considerably, so the chance of randomly picking a positive example from the set of all uncovered examples decreases as well, which in turn might slow down the learner. Although adding only positive uncovered examples may increase the chance of learning over-general rules, such rules will be detected by the second part of our criterion, and appropriate counter-examples will eventually be added to the window.
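The candidate selection described above can be sketched roughly as follows. This is only an illustrative sketch, not the implementation used in the paper: the function name `resample` and the callbacks `covers`, `is_significant` and `is_positive` are hypothetical, examples are assumed to be hashable, "covered by insignificant rules" is read as "covered by at least one insignificant rule", and the MaxIncSize candidates are assumed to be drawn uniformly at random, which the text does not specify.

```python
import random


def resample(window, training_set, rules, covers, is_significant,
             is_positive, max_inc_size, rng=None):
    """Return a new window extended by at most max_inc_size candidates.

    All callbacks are hypothetical placeholders:
      covers(rule, example)  -> True if the rule covers the example
      is_significant(rule)   -> True if the rule passed the significance test
      is_positive(example)   -> True if the example is a positive example
    """
    rng = rng or random.Random()
    in_window = set(window)  # assumes examples are hashable
    candidates = []
    for example in training_set:
        if example in in_window:
            continue  # only examples not yet in the window qualify
        covering = [rule for rule in rules if covers(rule, example)]
        if not covering:
            # Uncovered examples are candidates only if they are positive.
            if is_positive(example):
                candidates.append(example)
        elif any(not is_significant(rule) for rule in covering):
            # Assumption: one insignificant covering rule is enough.
            candidates.append(example)
    # Assumption: uniform random choice of up to max_inc_size candidates.
    chosen = rng.sample(candidates, min(max_inc_size, len(candidates)))
    return list(window) + chosen
```

Under these assumptions, an example that is covered only by significant rules is never added, and negative examples enter the window only through insignificant rules that cover them.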


