Georgia Tech, 8803 Machine Learning Theory, Spring 2010

MACHINE LEARNING THEORY

Maria Florina Balcan


Time: Tues/Thurs 3:05-4:25, Place: Skiles 168.

Course description: Machine learning studies automatic methods for learning to make accurate predictions or useful decisions based on past observations and experience, and it has become a highly successful discipline with applications in many areas such as natural language processing, speech recognition, computer vision, and gene discovery. This course on the design and theoretical analysis of machine learning methods will cover a broad range of important problems studied in theoretical machine learning. It will provide a basic arsenal of powerful mathematical tools for their analysis, focusing on both statistical and computational aspects. We will examine questions such as "What guarantees can we prove on the performance of learning algorithms?" and "What can we say about the inherent ease or difficulty of learning problems?". In addressing these and related questions we will make connections to statistics, algorithms, complexity theory, information theory, game theory, and empirical machine learning research.

Recommended (but not required) textbooks:

Additionally, we will use a number of survey articles and tutorials.

Office hours: Just stop by my office or make an appointment.

Final Exam: Here is the exam. It's due at 4:00 PM on May 6th.


Homeworks


Project ideas


Lecture Notes & Handouts

  1. 01/12: Introduction. The consistency model and PAC model for passive supervised learning.
    See also Chapter 1 in the Kearns-Vazirani book.
  2. 01/14: PAC model. Perceptron algorithm for learning linear separators.
  3. 01/19: The margin perceptron algorithm. The mistake bound model.
  4. 01/21: The mistake bound model (cont'd). The Winnow algorithm.
    See also: An Online learning survey by A. Blum. The original Winnow paper by N. Littlestone.
  5. 01/26: Tail inequalities. Simple sample complexity results.
  6. 01/28: Sample complexity results for infinite hypothesis spaces.
  7. 02/02: Sample complexity results for infinite hypothesis spaces (cont'd). Sauer's lemma.
    See also Chapter 3 in the Kearns-Vazirani book.
  8. 02/04: Lower bounds on the complexity of passive supervised learning.
  9. 02/09: Boosting I: Weak to strong learning, Schapire's original method. See Avrim Blum's notes.
    See also Chapter 4 in the Kearns-Vazirani book.
  10. 02/11: Boosting II: AdaBoost. See Rob Schapire's notes.
  11. 02/16: Boosting and margins.
  12. 02/18: Semi-supervised Learning.
    See: A Discriminative Model for Semi-Supervised Learning.
  13. 02/23: Generalization bounds based on the Rademacher Complexity.
  14. 02/25: Properties of the Rademacher Complexity.
    See Introduction to Statistical Learning Theory by O. Bousquet, S. Boucheron, and G. Lugosi.
  15. 03/02: Kernels and Margins.
  16. 03/04: Kernels and Margins (cont'd).
    See: Kernels as Features: On Kernels, Margins, and Low-dimensional Mappings.
  17. 03/09: More examples of kernels. ANOVA kernels and diffusion kernels.
  18. 03/11: Model Selection.
    See Model Selection and Error Estimation by P. Bartlett, S. Boucheron, and G. Lugosi.
  19. 03/16: Support Vector Machines. See Andrew Ng's notes. See also Boyd's lectures.
  20. 03/18: Limitations of Kernel Methods.
  21. 03/29: The Random Classification Noise Model.
  22. 04/01: The Random Classification Noise Model (cont'd). The Statistical Query model. See Avrim Blum's notes.
    See also Chapter 5 in the Kearns-Vazirani book.
  23. 04/06: Fourier-based learning and learning with Membership queries: the KM algorithm. See Ron Rivest's lecture notes.
  24. 04/08: Fourier-based learning and learning with Membership queries: the KM algorithm (cont'd). See Ron Rivest's lecture notes.
    See also Yishay Mansour's survey.
  25. 04/13: Incorporating Unlabeled Data in the Learning Process.
  26. 04/15: Active Learning Slides and Notes.
  27. 04/20: Project Presentations.
  28. 04/22: Project Presentations.
  29. 04/27: Sample Complexity Results for Active Learning. See Shai Shalev-Shwartz's notes.
  30. 04/29: The Weighted Majority Algorithm. Course Retrospective.
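Several of the algorithms above are short enough to try out directly. As an illustration of the mistake-driven updates covered in Lectures 2-4, here is a minimal Perceptron sketch in Python (not from the course materials; the toy data, epoch cap, and function names are my own). It repeats passes over the data, updating the weight vector only on misclassified examples, and stops after a clean pass — the Perceptron mistake bound guarantees this happens on linearly separable data.

```python
def perceptron(data, max_epochs=100):
    """Classic Perceptron. `data` is a list of (x, y) pairs with x a
    feature vector (list of floats) and y in {-1, +1}. Returns a weight
    vector w; the learned classifier predicts sign(w . x)."""
    d = len(data[0][0])
    w = [0.0] * d
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            # Mistake-driven update: change w only on a misclassification
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # clean pass: every point classified correctly
            break
    return w

# Toy data, linearly separable by the hyperplane x1 - x2 = 0
data = [([2.0, 1.0], 1), ([1.0, 3.0], -1),
        ([4.0, 0.5], 1), ([0.5, 2.0], -1)]
w = perceptron(data)
assert all(y * sum(wi * xi for wi, xi in zip(w, x)) > 0 for x, y in data)
```

By the Perceptron mistake bound, the number of updates is at most (R/γ)², where R bounds the norms of the examples and γ is the margin of some unit-norm separator, so the epoch cap above is generous for this data.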

Additional Readings & More Information

Books and tutorials: