Georgia Tech, 8803 Machine Learning Theory, Spring 2010
MACHINE LEARNING THEORY
Time: Tues/Thurs 3:05-4:25. Place: Skiles 168.
Course description: Machine learning studies automatic methods for learning to make accurate predictions or useful decisions based on past observations and experience, and it has become a highly successful discipline with applications in many areas such as natural language processing, speech recognition, computer vision, and gene discovery. This course on the design and theoretical analysis of machine learning methods will cover a broad range of important problems studied in theoretical machine learning. It will provide a basic arsenal of powerful mathematical tools for their analysis, focusing on both statistical and computational aspects. We will examine questions such as "What guarantees can we prove on the performance of learning algorithms?" and "What can we say about the inherent ease or difficulty of learning problems?". In addressing these and related questions, we will make connections to statistics, algorithms, complexity theory, information theory, game theory, and empirical machine learning research.
Recommended (but not required) textbooks:
Additionally, we will use a number of survey articles and tutorials.
Office hours: Just stop by my office or make an appointment.
Final Exam: Here is the exam. It is due at 4:00 PM on May 6th.
Homeworks
Lecture Notes & Handouts
- 01/12: Introduction. The consistency model and PAC model for passive supervised learning. See also Chapter 1 in the Kearns-Vazirani book.
- 01/14: PAC model. Perceptron algorithm for learning linear separators. (A code sketch appears after this schedule.)
- 01/19: The margin perceptron algorithm. The mistake bound model.
- 01/21: The mistake bound model (cont'd). The Winnow algorithm. (A code sketch appears after this schedule.) See also: an online learning survey by A. Blum, and the original Winnow paper by N. Littlestone.
- 01/26: Tail inequalities. Simple sample complexity results.
- 01/28: Sample complexity results for infinite hypothesis spaces.
- 02/02: Sample complexity results for infinite hypothesis spaces (cont'd). Sauer's lemma. See also Chapter 3 in the Kearns-Vazirani book.
- 02/04: Lower bounds on the complexity of passive supervised learning.
- 02/09: Boosting I: Weak to strong learning, Schapire's original method. See Avrim Blum's notes. See also Chapter 4 in the Kearns-Vazirani book.
- 02/11: Boosting II: AdaBoost. See Rob Schapire's notes. (A code sketch appears after this schedule.)
- 02/16: Boosting and margins.
- 02/18: Semi-supervised learning. See: A Discriminative Model for Semi-Supervised Learning.
- 02/23: Generalization bounds based on the Rademacher complexity. (A code sketch appears after this schedule.)
- 02/25: Properties of the Rademacher complexity. See Introduction to Statistical Learning Theory by O. Bousquet, S. Boucheron, and G. Lugosi.
- 03/02: Kernels and Margins.
- 03/04: Kernels and Margins (cont'd). See: Kernels as Features: On Kernels, Margins, and Low-dimensional Mappings.
- 03/09: More examples of kernels. ANOVA kernels and diffusion kernels.
- 03/11: Model selection. See Model Selection and Error Estimation by P. Bartlett, S. Boucheron, and G. Lugosi.
- 03/16: Support Vector Machines. See Andrew Ng's notes. See also Boyd's lectures.
- 03/18: Limitations of kernel methods.
- 03/29: The Random Classification Noise model.
- 04/01: The Random Classification Noise model (cont'd). The Statistical Query model. See Avrim Blum's notes. See also Chapter 5 in the Kearns-Vazirani book.
- 04/06: Fourier-based learning and learning with membership queries: the KM algorithm. See Ron Rivest's lecture notes.
- 04/08: Fourier-based learning and learning with membership queries: the KM algorithm (cont'd). See Ron Rivest's lecture notes. See also Yishay Mansour's survey.
- 04/13: Incorporating Unlabeled Data in the Learning Process.
- 04/15: Active learning. Slides and notes.
- 04/20: Project Presentations.
- 04/22: Project Presentations.
- 04/27: Sample complexity results for active learning. See Shai Shalev-Shwartz's notes.
- 04/29: The Weighted Majority algorithm. Course retrospective. (A code sketch appears after this schedule.)
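Algorithm Sketches
The Python sketches below are informal illustrations of a few algorithms from the schedule, not course-issued code; all function names and parameters are illustrative.

First, a minimal perceptron for learning linear separators (01/14): on each mistake it adds the misclassified example, scaled by its label, to the weight vector. For data separable with margin gamma inside a ball of radius R, the classic bound says it makes at most (R/gamma)^2 mistakes.

import numpy as np

def perceptron(X, y, max_epochs=100):
    # X: (n, d) array of examples; y: labels in {-1, +1}.
    # Returns a weight vector w that separates the training set
    # (through the origin) whenever such a separator exists.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            if y[i] * np.dot(w, X[i]) <= 0:  # mistake: additive update
                w += y[i] * X[i]
                mistakes += 1
        if mistakes == 0:                    # separates the data; stop
            break
    return w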
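Next, a sketch of Winnow for monotone disjunctions over {0,1}^d (01/21). Where the perceptron updates additively, Winnow updates multiplicatively, which yields a mistake bound of O(r log d) for an r-literal target disjunction.

import numpy as np

def winnow(X, y, max_epochs=100):
    # X: (n, d) array with entries in {0, 1}; y: labels in {0, 1}.
    # Predict 1 iff the weighted sum of active features reaches the
    # threshold d; promote or demote only the active features.
    n, d = X.shape
    w = np.ones(d)
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            pred = 1 if np.dot(w, X[i]) >= d else 0
            if pred != y[i]:
                mistakes += 1
                if y[i] == 1:
                    w[X[i] == 1] *= 2.0  # false negative: promote
                else:
                    w[X[i] == 1] /= 2.0  # false positive: demote
        if mistakes == 0:
            break
    return w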
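A sketch of AdaBoost (02/11). Each round reweights the examples to emphasize those the previous weak hypothesis got wrong, and the final classifier is a weighted vote. The weak_learner argument is a hypothetical callable, assumed to return a hypothesis with weighted error below 1/2.

import numpy as np

def adaboost(X, y, weak_learner, T=50):
    # Labels y in {-1, +1}. weak_learner(X, y, D) is a hypothetical
    # callable returning a hypothesis h with h(X) in {-1, +1}^n.
    n = X.shape[0]
    D = np.full(n, 1.0 / n)              # start uniform over examples
    hs, alphas = [], []
    for _ in range(T):
        h = weak_learner(X, y, D)
        preds = h(X)
        eps = D[preds != y].sum()        # weighted training error
        if eps >= 0.5:                   # no weak-learning advantage left
            break
        eps = max(eps, 1e-12)            # guard the log below
        alpha = 0.5 * np.log((1.0 - eps) / eps)
        D *= np.exp(-alpha * y * preds)  # upweight mistakes, downweight hits
        D /= D.sum()                     # renormalize to a distribution
        hs.append(h)
        alphas.append(alpha)
    def final(Xq):                       # weighted majority vote
        return np.sign(sum(a * h(Xq) for a, h in zip(alphas, hs)))
    return final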
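A Monte Carlo estimate of the empirical Rademacher complexity (02/23) for a finite hypothesis class on a fixed sample, i.e. E_sigma[ sup_h (1/n) sum_i sigma_i h(x_i) ] over uniform random signs sigma_i in {-1, +1}. This is only an illustration for finite classes given as a matrix of hypothesis outputs.

import numpy as np

def empirical_rademacher(hypothesis_outputs, n_trials=1000, seed=None):
    # hypothesis_outputs: (m, n) array; row j holds h_j(x_1..x_n)
    # for the j-th hypothesis, with values in [-1, 1].
    rng = np.random.default_rng(seed)
    m, n = hypothesis_outputs.shape
    total = 0.0
    for _ in range(n_trials):
        sigma = rng.choice([-1.0, 1.0], size=n)      # random sign vector
        total += np.max(hypothesis_outputs @ sigma) / n  # sup over the class
    return total / n_trials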
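Finally, the deterministic Weighted Majority algorithm (04/29): predict with a weighted vote of the experts, and multiply the weight of each expert that errs by beta. For beta = 1/2 this gives a mistake count of O(m* + log N), where m* is the number of mistakes of the best expert in hindsight.

import numpy as np

def weighted_majority(expert_preds, outcomes, beta=0.5):
    # expert_preds: (T, N) array of {0,1} predictions, one row per round;
    # outcomes: (T,) array of true {0,1} labels. Returns the algorithm's
    # mistake count and the final expert weights.
    T, N = expert_preds.shape
    w = np.ones(N)
    mistakes = 0
    for t in range(T):
        vote1 = w[expert_preds[t] == 1].sum()
        vote0 = w[expert_preds[t] == 0].sum()
        pred = 1 if vote1 >= vote0 else 0
        if pred != outcomes[t]:
            mistakes += 1
        w[expert_preds[t] != outcomes[t]] *= beta  # penalize wrong experts
    return mistakes, w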
Additional Readings & More Information
Books and tutorials: