CMU 15-859(B), Spring 2008

MACHINE LEARNING THEORY

Avrim Blum


Time: MW 3:00-4:20, Place: Wean 4623
Office hours: After class and Thurs 2-3 are best for me, but also feel free to stop by my office or make an appointment.

Course description: This course will focus on theoretical aspects of machine learning. We will examine questions such as: What kinds of guarantees can one prove about learning algorithms? What are good algorithms for achieving certain types of goals? Can we devise models that are both amenable to mathematical analysis and make sense empirically? What can we say about the inherent ease or difficulty of learning problems? Addressing these questions will require pulling in notions and ideas from statistics, complexity theory, information theory, cryptography, game theory, and empirical machine learning research. Grading will be based on 6 homework assignments, class participation, a small class project, and a take-home final (worth about 2 homeworks). Students from time to time will also be asked to help with the grading of assignments. [2007 version]

Text: An Introduction to Computational Learning Theory by Michael Kearns and Umesh Vazirani, plus papers and notes for topics not in the book.


Handouts


Lecture Notes & tentative plan

  1. 01/14: Introduction. PAC model and Occam's razor.
  2. 01/16: The Mistake-Bound model. Combining expert advice. Connections to info theory and game theory.
  3. 01/23: The Winnow algorithm + applications.
  4. 01/28: The Perceptron Algorithm, Margins, and Kernel functions.
  5. 01/30: Uniform convergence, tail inequalities (Chernoff/Hoeffding), VC-dimension I. [more notes]
  6. 02/04: VC-dimension II.
  7. 02/06: Boosting I: weak vs strong learning, basic issues. [plus some slides]
  8. 02/11: Boosting II: Adaboost + connection to WM analysis + L_1 margin bounds.
  9. 02/13: Margins, random projection, kernels, and general similarity functions
  10. 02/18: Margin bounds, Support Vector Machines.
  11. 02/20: McDiarmid's inequality & Rademacher bounds.
  12. 02/25: Maxent and maximum-likelihood exponential models. Connection to winnow.
  13. 02/27: Cryptographic hardness results
  14. 03/03: Statistical Query model I
  15. 03/05: Statistical Query model II
  16. 03/17: Fourier-based algorithms
  17. 03/19: Membership Query algorithms
  18. 03/24: Membership Query algorithms II
  19. 03/26: Learning finite-state environments
  20. 03/31: MDPs and reinforcement learning I
  21. 04/02: MDPs and reinforcement learning II
  22. 04/07: Semi-supervised learning
  23. 04/09: Offline->online optimization
  24. 04/14: online learning and game theory I
  25. 04/16: online learning and game theory II
  26. 04/21: Active learning & course retrospective
  27. 04/23: [No Class Today]
  28. 04/28: Project presentations
  29. 04/30: Project presentations

Additional Readings & More Information

Books and tutorials:

Online Learning:

PAC sample complexity: Boosting:

More on Kernels:

Fourier analysis, weak learning, SQ learning:

Computational hardness results:

Web sites on maxent: