Avrim Blum
Semi-supervised Learning
Automatic methods for collecting data have in many domains far outstripped the pace of human annotation. In machine learning, the result has been a growing interest in semi-supervised learning: learning algorithms that combine labeled and unlabeled data in a way that usefully leverages a large body of unlabeled data to improve learning from a small labeled sample. In this talk, I will survey a number of very different learning algorithms that have been designed for this task (including Co-Training, Semi-Supervised SVM, and graph-based methods). I will then describe a new theoretical framework for semi-supervised learning that can be used to analyze when unlabeled data can help and how much help it can provide, and to place these algorithms in a common context. I will also discuss a number of conceptual issues that this model raises. Portions of this talk are joint work with Nina Balcan.