# MACHINE LEARNING THEORY

## Avrim Blum

### MW 3:00-4:20, Wean 4623

Course description: This course will focus on theoretical aspects of machine learning. We will examine questions such as: What kinds of guarantees can one prove about learning algorithms? What are good algorithms for achieving certain types of goals? Can we devise models that are both amenable to mathematical analysis and that make sense empirically? What can we say about the inherent ease or difficulty of learning problems? Addressing these questions will require pulling in notions and ideas from statistics, complexity theory, information theory, cryptography, game theory, and empirical machine learning research. Grading will be based on 6 homework assignments, class participation, a small class project, and a take-home final (worth about 2 homeworks). Students will also occasionally be asked to help with grading assignments. [2008 version]

Text: An Introduction to Computational Learning Theory by Michael Kearns and Umesh Vazirani, plus papers and notes for topics not in the book.

Office hours: MW 4:30-5:15

### Lecture Notes & tentative plan

1. 01/12: Introduction. PAC model and Occam's razor.
2. 01/14: The Mistake-Bound model. Combining expert advice. Connections to info theory and game theory.
3. 01/21: The Winnow algorithm.
4. 01/26: The Perceptron Algorithm, Margins, and Kernel functions.
5. 01/28: Uniform convergence, tail inequalities (Chernoff/Hoeffding), VC-dimension I. [more notes]
6. 02/02: VC-dimension II.
7. 02/04: Rademacher bounds and McDiarmid's inequality.
8. 02/09: Boosting I: weak vs strong learning, basic issues.
9. 02/11: Boosting II: Adaboost + connection to WM analysis + L_1 margin bounds.
10. 02/16: Support Vector Machines, properties of kernels, MB=>PAC, L_2 margin bounds.
11. 02/18: Margins, kernels, and general similarity functions (L_1 and L_2 connection).
12. 02/23: Cryptographic hardness results.
13. 02/25: Maxent and maximum-likelihood exponential models. Connection to winnow.
14. 03/02: Statistical Query model I.
15. 03/04: Statistical Query model II.
16. 03/16: Fourier-based algorithms.
17. 03/18: Membership Query algorithms.
18. 03/23: Membership Query algorithms II.
19. 03/25: Learning finite-state environments.
20. 03/30: Learning finite-state environments II.
21. 04/01: MDPs and reinforcement learning.
22. 04/06: Offline->online optimization.
23. 04/08: Bandit problems.
24. 04/13: Active learning [Steve Hanneke]
25. 04/15: [class cancelled today]
26. 04/20: online learning and game theory.
27. 04/22: Semi-supervised learning.
28. 04/27: Project presentations
29. 04/29: Project presentations
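
Many of the algorithms in the list above have very short implementations. As one illustration (not taken from the course notes), here is a minimal sketch of the Perceptron algorithm from lecture 4, using mistake-driven updates on linearly separable data:

```python
# Minimal Perceptron sketch (illustrative; names are not from the course).
# On each mistake, the weight vector is moved toward y * x.

def perceptron(examples, max_passes=100):
    """examples: list of (x, y) with x a tuple of floats and y in {-1, +1}.
    Returns a weight vector w with sign(w . x) == y on every example,
    assuming the data are linearly separable through the origin."""
    dim = len(examples[0][0])
    w = [0.0] * dim
    for _ in range(max_passes):
        mistakes = 0
        for x, y in examples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                # Mistake: update w toward y * x.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # consistent with all examples
            break
    return w

# Toy data, separable through the origin by w = (1, 1).
data = [((1.0, 1.0), 1), ((2.0, 0.5), 1),
        ((-1.0, -1.0), -1), ((-0.5, -2.0), -1)]
w = perceptron(data)
```

The classic mistake bound covered in lecture, (R/γ)², depends on the data's radius and margin, not on the dimension, which is what makes the kernelized version of lecture 4 work.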

Books and tutorials:

Online Learning:

PAC sample complexity:
• David Haussler, chapter on the PAC learning model and decision-theoretic generalizations, with applications to neural nets. From Mathematical Perspectives on Neural Networks, Lawrence Erlbaum Associates, 1995; reprints material from "Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications", Information and Computation, Vol. 100, September 1992, pp. 78-150. A really nice survey of the PAC model and various sample-complexity results.

• David Williamson, John Shawe-Taylor, Bernhard Schölkopf, and Alex Smola, Sample Based Generalization Bounds. Gives tighter generalization bounds where, instead of "the maximum number of ways of labeling a set of 2m points", one can use "the number of ways of labeling your actual sample".

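
The distinction in the Williamson et al. bound can be made concrete with a toy example (mine, not the paper's): for one-dimensional threshold classifiers, the number of labelings induced on an actual sample of m points is at most m+1, far below the generic 2^m worst case.

```python
# Count the distinct labelings of an actual sample induced by 1-D
# threshold classifiers h_t(x) = +1 if x > t else -1, and compare
# with the generic 2^m bound. Illustrative only.

def threshold_labelings(sample):
    """Return the set of distinct labelings of `sample` (a list of
    floats) realizable by some threshold t."""
    labelings = set()
    points = sorted(set(sample))
    # One threshold below all points, plus one just above each point,
    # suffices to realize every achievable labeling.
    thresholds = [points[0] - 1.0] + [p + 1e-9 for p in points]
    for t in thresholds:
        labelings.add(tuple(1 if x > t else -1 for x in sample))
    return labelings

sample = [0.3, 1.7, 2.2, 4.0, 5.5]
print(len(threshold_labelings(sample)), "labelings vs 2^m =", 2 ** len(sample))
# For m = 5 points: 6 labelings, versus 2^5 = 32.
```

Plugging the sample-dependent count into a generalization bound in place of the worst-case growth function is exactly the kind of tightening the paper pursues (for general hypothesis classes, not just thresholds).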
Boosting:

More on Kernels:

Fourier analysis, weak learning, SQ learning:

Computational hardness results:

Web sites on maxent: