CMU 15-859(B), Spring 2010

MACHINE LEARNING THEORY

Avrim Blum

MW 3:00-4:20, GHC 4102

Course description: This course will focus on theoretical aspects of machine learning. We will examine questions such as: What kinds of guarantees can we prove about learning algorithms? Can we design algorithms for interesting learning tasks with strong guarantees on accuracy and amounts of data needed? What can we say about the inherent ease or difficulty of learning problems? Can we devise models that are both amenable to theoretical analysis and make sense empirically? Addressing these questions will bring in connections to probability and statistics, online algorithms, game theory, complexity theory, information theory, cryptography, and empirical machine learning research.

Grading will be based on 6 homework assignments, class participation, a small class project, and a take-home final (worth about 2 homeworks). From time to time, students will also be asked to help grade assignments.

Prerequisites: 15-781/10-701/15-681 Machine Learning, or 15-750 Algorithms, or a background in theory/algorithms or in machine learning.

Text: An Introduction to Computational Learning Theory by Michael Kearns and Umesh Vazirani, plus papers and notes for topics not in the book.

Office hours: Email / stop by anytime!

Handouts

[Online learning survey] [Winnow paper] [Winnow vs Perceptron] [Handout on tail inequalities]
Homework 1. Solutions.
Homework 2. Solutions.
Homework 3. Solutions.
Homework 4. Solutions.
Homework 4.5 (plan for your course project). Due March 22.
Homework 5. Solutions.
Homework 6. Solutions.

Lecture Notes & tentative plan

Additional Readings & More Information

Robert Williamson, John Shawe-Taylor, Bernhard Scholkopf, and Alex Smola. Sample Based Generalization Bounds. Gives tighter generalization bounds: instead of using "the maximum number of ways of labeling a set of 2m points", you can use "the number of ways of labeling your actual sample".
See also a previous version of the course.
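
To make the distinction in the Sample Based Generalization Bounds reading concrete, here is a minimal sketch that contrasts the worst-case number of labelings with the number realized on one particular sample. It is only an illustration under stated assumptions: the hypothesis class (one-dimensional thresholds), the function names, and the sample values are hypothetical and are not taken from the paper or the course materials.

```python
def labelings_by_thresholds(sample):
    """Return the distinct labelings of `sample` induced by threshold
    classifiers h_t(x) = 1 if x >= t else 0 (an assumed toy class)."""
    # One candidate threshold per distinct value, plus one above the maximum
    # (which labels every point 0). Any other threshold induces the same
    # labeling as one of these candidates.
    candidates = sorted(set(sample)) + [float("inf")]
    return {tuple(int(x >= t) for x in sample) for t in candidates}


def max_labelings_thresholds(n):
    """Worst case over all sets of n points: n distinct points admit
    exactly n + 1 threshold labelings."""
    return n + 1


# Hypothetical sample with repeated values, so fewer labelings are realized.
sample = [0.3, 0.3, 0.7, 0.7, 0.7, 1.2]
empirical = len(labelings_by_thresholds(sample))
worst_case = max_labelings_thresholds(len(sample))
print(f"labelings of this sample: {empirical}, worst case: {worst_case}")
```

On this sample only 4 labelings are realized, against a worst case of 7 for six points; a sample-based bound can exploit that kind of gap.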