CMU 10-806 Foundations of Machine Learning and Data Science, Fall 2015

Instructors: Nina Balcan and Avrim Blum

Mon/Wed 4:30-5:50, GHC 4303


Course description: This course will cover fundamental topics in Machine Learning and Data Science, including powerful algorithms with provable guarantees for making sense of and generalizing from large amounts of data. The course will start by providing a basic arsenal of useful statistical and computational tools, including generalization guarantees, core algorithmic methods, and fundamental analysis models. We will examine questions such as: Under what conditions can we hope to meaningfully generalize from limited data? How can we best combine different kinds of information, such as labeled and unlabeled data, multiple related learning tasks, or multiple types of features? What can we prove about methods for summarizing and making sense of massive datasets, especially under limited memory? We will also examine other important constraints and resources in data science, including privacy, communication, and limited interaction. In addressing these and related questions we will make connections to statistics, algorithms, linear algebra, complexity theory, information theory, optimization, game theory, and empirical machine learning research. [More info] [People and office hours]


Homeworks


Take-home final

You can take the test in any 24-hour period you choose, up until Friday December 18 (i.e., midnight on December 18 is the latest hand-in time).
Here is the take-home final.

Projects [Project ideas]

Project poster presentations will be Thursday December 17, 4-6pm in the GHC 7th floor Atrium. If you cannot make that time, please come talk with us.

Writeups due by Sunday December 20.


Lecture Notes & Handouts

  1. 09/09: Introduction. The consistency model.
    See also Chapter 1 in the Kearns-Vazirani book.
  2. 09/14: The PAC model for passive supervised learning.
    See also Chapter 1 in the Kearns-Vazirani book. (A worked statement of the basic PAC bound appears after this list.)
  3. 09/16: Effective number of hypotheses, VC-dimension, and Sauer's lemma.
    See also Chapter 3 in the Kearns-Vazirani book. (Sauer's lemma and the resulting sample-complexity bound are restated after this list.)
  4. 09/21: Sample complexity results for infinite hypothesis spaces.
    See also Chapter 3 in the Kearns-Vazirani book.
  5. 09/23: Sample complexity results for infinite hypothesis spaces (cont'd).
    See also Chapter 3 in the Kearns-Vazirani book.
  6. 09/28: Sample complexity lower bounds for passive supervised learning.
    See also Chapter 3.6 in the Kearns-Vazirani book.
  7. 09/30: Sample complexity results for the agnostic case.
  8. 10/05: Generalization bounds based on Rademacher complexity.
    See also Chapter 3 in the Mohri, Rostamizadeh, and Talwalkar book. (The Rademacher bound is restated after this list.)
    See also the survey Introduction to Statistical Learning Theory by O. Bousquet, S. Boucheron, and G. Lugosi.
    See also the survey Theory of Classification: A Survey of Some Recent Advances by S. Boucheron, O. Bousquet, and G. Lugosi.
  9. 10/07: Computational hardness results.
    See also Ch. 1.4, 1.5, and 6.1 in the Kearns-Vazirani book. More resources on NP-hardness: 1, 2.
  10. 10/12: Online learning and optimization I: mistake bounds and combining expert advice.
    Further reading: book chapter. (A small weighted-majority sketch appears after this list.)
  11. 10/14: Online learning and optimization II: ERM and Follow the Regularized Leader.
    See also the Shalev-Shwartz monograph. (The generic FTRL update is restated after this list.)
  12. 10/19: Online learning and optimization III: FTRL (cont'd) and Follow the Perturbed Leader.
  13. 10/21: Online learning and optimization IV: FPL (cont'd) and the multi-armed bandit setting.
  14. 10/26: Boosting: weak learning, strong learning, and AdaBoost.
    See also Chapter 4 in the Kearns-Vazirani book and Chapter 6 in the Mohri-Rostamizadeh-Talwalkar book.
    See also Rob Schapire's notes. (A minimal AdaBoost sketch appears after this list.)
  15. 10/28: Learning and game theory.
  16. 11/02: Learning and game theory (cont'd).
    See also this book chapter.
  17. 11/04: Streaming algorithms I: estimating frequency counts, the count-min sketch, and the start of distinct-element counting.
    See also the Muthukrishnan lecture notes and the Chakrabarti lecture notes. (A small count-min sketch appears after this list.)
  18. 11/09: Streaming algorithms II: distinct-element counting and frequency moments.
  19. 11/11: The Johnson-Lindenstrauss lemma and tensor methods.
    See also the Moitra notes and the Dasgupta notes. (A small random-projection sketch appears after this list.)
  20. 11/16: Foundations of active learning (intro slides and lecture notes).
    See also the survey Two Faces of Active Learning by S. Dasgupta.
    See also the survey on active learning by Balcan and Urner.
  21. 11/18: Disagreement-based active learning.
    See also the survey Theory of Disagreement-Based Active Learning by S. Hanneke.
  22. 11/23: Active learning of linear separators (lecture notes and slides).
  23. 11/30: Semi-supervised learning.
    See also this survey article.
  24. 12/02: Semi-supervised learning (cont'd) and a brief discussion of multilayer networks.
  25. 12/07: Differential privacy and statistical query learning (lecture notes and slides). (A small Laplace-mechanism sketch appears after this list.)
  26. 12/09: Distributed learning.
    See also: Jordan and Mitchell, Machine learning: Trends, Perspectives, and Prospects.
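

Supplementary sketches

The sketches below illustrate a few of the topics above in standard textbook form; notation, constants, and interfaces may differ from what was presented in class.

For lecture 2, the basic PAC sample-complexity statement for a finite hypothesis class H (realizable case): if we draw

    m \ge \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)

i.i.d. labeled examples, then with probability at least 1 - \delta every h in H that is consistent with the sample has true error at most \epsilon.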
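
For lectures 3-5, Sauer's lemma for a class H of VC-dimension d, and the sample-complexity bound it yields (realizable case, up to constants):

    \Pi_H(m) \le \sum_{i=0}^{d} \binom{m}{i} \le \left(\frac{em}{d}\right)^d \quad (m \ge d),

    m = O\!\left(\frac{1}{\epsilon}\left(d\ln\frac{1}{\epsilon} + \ln\frac{1}{\delta}\right)\right).

Here \Pi_H(m) is the growth function (the effective number of hypotheses on m points) from lecture 3.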
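
For lecture 8, one standard form of the Rademacher generalization bound for a class F of functions into [0, 1] (the constants follow Mohri et al., Chapter 3; other sources differ slightly): with probability at least 1 - \delta over an i.i.d. sample z_1, ..., z_m,

    \mathbb{E}[f(z)] \le \frac{1}{m}\sum_{i=1}^{m} f(z_i) + 2R_m(F) + \sqrt{\frac{\ln(1/\delta)}{2m}} \quad \text{for all } f \in F,

where R_m(F) = \mathbb{E}_{S,\sigma}\left[\sup_{f \in F}\frac{1}{m}\sum_{i=1}^{m}\sigma_i f(z_i)\right] and the \sigma_i are i.i.d. uniform \pm 1 signs.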
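
For lecture 10, a minimal sketch of the deterministic weighted-majority algorithm over n experts with binary predictions. This is a generic illustration, not code from the course; the function name and the list-of-lists input format are assumptions of this sketch.

    def weighted_majority(expert_preds, outcomes, beta=0.5):
        """Deterministic weighted majority over binary experts.

        expert_preds: length-T list of rounds, each a list of n predictions in {0, 1}.
        outcomes: length-T list of true labels in {0, 1}.
        With beta = 1/2, the classic bound is
        mistakes <= 2.41 * (best expert's mistakes + lg n).
        """
        n = len(expert_preds[0])
        w = [1.0] * n                         # one weight per expert
        mistakes = 0
        for preds, y in zip(expert_preds, outcomes):
            vote_1 = sum(wi for wi, p in zip(w, preds) if p == 1)
            vote_0 = sum(wi for wi, p in zip(w, preds) if p == 0)
            if (1 if vote_1 >= vote_0 else 0) != y:
                mistakes += 1
            # Multiply the weight of every expert that erred by beta.
            w = [wi * beta if p != y else wi for wi, p in zip(w, preds)]
        return mistakes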
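
For lectures 11-12, the generic Follow-the-Regularized-Leader template over a convex set K with regularizer R and learning rate \eta (again a standard form, not necessarily the lecture's notation):

    w_{t+1} = \arg\min_{w \in K}\left[\sum_{s=1}^{t}\ell_s(w) + \frac{1}{\eta}R(w)\right].

With linear losses \ell_s(w) = \langle g_s, w\rangle and R(w) = \frac{1}{2}\|w\|_2^2 this reduces to lazy projected online gradient descent, and a suitably tuned fixed \eta gives regret O(\sqrt{T}).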
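
For lecture 14, a minimal AdaBoost sketch. The weak_learner argument is a hypothetical callable (not part of the course materials) assumed to return a hypothesis h(x) in {-1, +1} with weighted error below 1/2.

    import math

    def adaboost(X, y, weak_learner, T):
        """Minimal AdaBoost. X: list of examples; y: labels in {-1, +1}.

        weak_learner(X, y, D) is assumed to return a callable hypothesis
        h with weighted error below 1/2 under the distribution D.
        """
        m = len(X)
        D = [1.0 / m] * m                      # start with the uniform distribution
        ensemble = []                          # (alpha_t, h_t) pairs
        for _ in range(T):
            h = weak_learner(X, y, D)
            eps = sum(d for d, x, yi in zip(D, X, y) if h(x) != yi)
            if eps == 0:
                return h                       # weak hypothesis is already perfect
            if eps >= 0.5:
                break                          # weak-learning assumption violated
            alpha = 0.5 * math.log((1 - eps) / eps)
            ensemble.append((alpha, h))
            # Up-weight mistakes, down-weight correct points, then renormalize.
            D = [d * math.exp(-alpha * yi * h(x)) for d, x, yi in zip(D, X, y)]
            Z = sum(D)
            D = [d / Z for d in D]
        def H(x):                              # final weighted-majority vote
            return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
        return H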
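
For lecture 17, a small count-min sketch. Python's built-in hash with per-row salts stands in for the pairwise-independent hash family the analysis assumes; the class name and interface are assumptions of this sketch.

    import random

    class CountMinSketch:
        """Count-min sketch for frequency estimation over a stream.

        Roughly: with width about e/eps and depth about ln(1/delta), each
        estimate overshoots the true count by at most eps * (stream length)
        with probability at least 1 - delta, and never undershoots.
        """
        def __init__(self, width, depth, seed=0):
            rng = random.Random(seed)
            self.width = width
            self.salts = [rng.getrandbits(64) for _ in range(depth)]
            self.table = [[0] * width for _ in range(depth)]
        def _cells(self, item):
            for row, salt in enumerate(self.salts):
                yield row, hash((salt, item)) % self.width
        def add(self, item, count=1):
            for row, col in self._cells(item):
                self.table[row][col] += count
        def estimate(self, item):
            # The minimum over rows suffers the least from hash collisions.
            return min(self.table[row][col] for row, col in self._cells(item))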
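
For lecture 19, a dependency-free random-projection sketch in the spirit of the Johnson-Lindenstrauss lemma: projecting n points to k = O(log(n)/eps^2) dimensions with i.i.d. Gaussian entries preserves all pairwise distances to within a (1 +/- eps) factor with high probability. The function name and plain-list point representation are assumptions of this sketch.

    import math
    import random

    def jl_project(points, k, seed=0):
        """Map d-dimensional points (lists of floats) to k dimensions using a
        random Gaussian matrix with entries N(0, 1/k), so squared norms are
        preserved in expectation."""
        rng = random.Random(seed)
        d = len(points[0])
        R = [[rng.gauss(0, 1) / math.sqrt(k) for _ in range(d)] for _ in range(k)]
        return [[sum(r * x for r, x in zip(row, p)) for row in R] for p in points]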
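
For lecture 25, the Laplace mechanism, the textbook example of an epsilon-differentially-private release (the lecture may have emphasized other mechanisms; this is a generic illustration with an assumed interface).

    import math
    import random

    def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
        """Return true_answer + Laplace(0, sensitivity/epsilon) noise.

        For a numeric query whose value changes by at most `sensitivity`
        when one record is added or removed, this release satisfies
        epsilon-differential privacy.
        """
        scale = sensitivity / epsilon
        u = rng.random() - 0.5                # u in [-0.5, 0.5)
        # Inverse-CDF sample from the Laplace distribution (the measure-zero
        # endpoint u = -0.5 would make log(0) blow up; ignored in this sketch).
        return true_answer - scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))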