TOPICS IN MACHINE LEARNING THEORY

Avrim Blum

Wed/Fri 11:00-12:15, SC 1109

(Office hours: Wed 4:00-5:00, SC 3212)

Course description: This seminar class will focus on new results and directions in machine learning theory. Machine learning theory concerns questions such as: What kinds of guarantees can we prove about practical machine learning methods, and can we design algorithms achieving desired guarantees? (Why) is Occam's razor a good idea and what does that even mean? What can we say about the inherent ease or difficulty of different types of learning problems? Addressing these questions will bring in connections to probability and statistics, online algorithms, game theory, computational geometry, and empirical machine learning research.

The first half of the course will involve the instructor presenting some classic results and background including regret guarantees, combining expert advice, Winnow and Perceptron algorithms, VC-dimension, Rademacher complexity, SVMs, and Kernel functions. The second half will involve student-led discussions of recent papers in areas such as deep learning, multi-task learning, tensor methods, structured prediction, dictionary learning, and other topics.
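To give a concrete feel for one of these classic algorithms, here is a minimal sketch of the Perceptron algorithm; the data below is a made-up toy example, not course code:

```python
# Minimal Perceptron sketch (illustrative only; data below is made up).

def perceptron(examples, labels, epochs=10):
    """Online Perceptron: on each mistake, move w toward y * x."""
    d = len(examples[0])
    w = [0.0] * d
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(examples, labels):
            # Predict sign(w . x); a margin of 0 counts as a mistake.
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]  # update on mistake
                mistakes += 1
        if mistakes == 0:  # no mistakes in a full pass: done on separable data
            break
    return w

# Tiny linearly separable toy set: label is the sign of x1 - x2.
X = [(2.0, 1.0), (1.0, 3.0), (3.0, 0.5), (0.5, 2.0)]
y = [1, -1, 1, -1]
w = perceptron(X, y)
print(all(yi * sum(wi * xi for wi, xi in zip(w, x)) > 0 for x, yi in zip(X, y)))
```

The classic guarantee covered in lecture is that on data separable with margin gamma inside a ball of radius R, the Perceptron makes at most (R/gamma)^2 mistakes.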

Lecture Notes and Tentative Plan [Schedule and Signup Sheet for the rest of the Semester]

1. 08/27: Introduction. The PAC model and Occam's razor.
2. 08/29: The Online Mistake-Bound model, Combining Expert Advice / Multiplicative Weights, Regret Minimization and connections to Game Theory.
3. 09/03: Sleeping/Shifting experts, the Winnow Algorithm, L_1 margin bounds. [(old) survey article on online learning]
4. 09/05: The Perceptron Algorithm, Margins, and intro to Kernels. [more notes]
5. 09/10: Do exercises 2,3 and problems 4,5 on this problem set. If you have time, also try problem 6 and look over exercise 1 (you don't need to write either of these up). See group-work protocol below.
6. 09/12: Do problems 3,4,5 on this problem set. Problem 3 is the hardest. If you have time, do exercise 1 (and think about the total weight). See group-work protocol below.
7. 09/17: VC-dimension, Sauer's lemma, and Uniform Convergence bounds. [more notes]
8. 09/19: Rademacher bounds: Uniform Convergence II.
9. 09/24: Boosting. [more notes]
10. 09/26: Kernels, similarity measures, and representations.
11. 10/01: Learning Finite-State Environments.
12. 10/03: Online linear optimization.
13. 10/08: Bandit algorithms, internal regret, game theory.
14. 10/10: Do problems 2,3,4 (and ideally also 5) on this problem set. Problem 5 is the hardest.
15. 10/15: Semi-Supervised Learning.
16. 10/17: Do problem 5 from the previous problem set (hint: think about decision lists). Do exercise 1 on this problem set.
17. 10/22: Kai-Wei Chang: Robust multi-objective learning with mentor feedback.
18. 10/24: Ching-Hua Yu: Bandits with switching costs.
19. 10/29: Ashish Khetan, Yingziang Yang, Shaileshh Bojja: Online learning with composite loss functions.
20. 10/31: Liwei Wang: Provable bounds for learning some deep representations.
21. 11/05: Hsien-Chih Chang, Chao Xu: Uniqueness of ordinal embedding.
22. 11/07: Anh Truong, Jalal Etesami: Multiarmed bandits with limited expert advice.
23. 11/12: Ching-Pei Lee: The sample complexity of agnostic learning under deterministic labels.
24. 11/14: Sheng Wang, Xueqing Liu, Ning Xu: Boosting with Online Binary Learners for the Multiclass Bandit Problem.
25. 11/19: Cem Subakan, John Wieting: Uniqueness of Tensor Decompositions with Applications to Polynomial Identifiability.
26. 11/21: Yihan Gao, Haoruo Peng, Fangbo Tao: Logistic Regression: Tight Bounds for Stochastic and Online Optimization.
27. 12/03: Kristen Vaccaro, Ismini Lourentzou, Casey Hanson: Belief Propagation, Robust Reconstruction and Optimal Recovery of Block Models.
28. 12/05: Stephen Mayhew, Shyam Upadhyay, Adam Vollrath: On the duality of strong convexity and strong smoothness.
29. 12/10: Kent and Daniel:
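As a companion to the combining-expert-advice material (lecture 2 above), here is a hedged sketch of the deterministic Weighted Majority algorithm; the experts and outcomes below are invented for illustration:

```python
# Hedged sketch of deterministic Weighted Majority; the expert predictions
# and outcomes below are made up for illustration.

def weighted_majority(expert_preds, outcomes, beta=0.5):
    """expert_preds[t][i] is expert i's 0/1 prediction at time t."""
    n = len(expert_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, truth in zip(expert_preds, outcomes):
        # Predict with the weighted majority vote of the experts.
        vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        if guess != truth:
            mistakes += 1
        # Penalize every wrong expert by multiplying its weight by beta.
        w = [wi * beta if p != truth else wi for wi, p in zip(w, preds)]
    return mistakes

# Three experts over four rounds; expert 0 happens to be always right.
preds = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [1, 0, 1]]
truth = [1, 0, 1, 1]
print(weighted_majority(preds, truth))
```

The standard analysis bounds the algorithm's mistakes by O(log n + m), where m is the number of mistakes of the best expert in hindsight; the randomized multiplicative-weights version improves the constants.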

Protocol for in-class problem sets: For each class we will have two "readers/facilitators". Their job is to display the problems on the projector, read them out to the class, and coordinate. After each problem is read, the class as a whole works together on solving it. Once somebody believes the problem has been solved, they get up and explain the solution to the rest of the class, which then either agrees or finds a bug, in which case the process repeats. Once the problem has been solved, 1-3 people volunteer to write up the solution and send it to the readers/facilitators. Please include in the writeup the names of all those who contributed to the solution. The readers/facilitators will combine the solutions received into a single document and email it to me (avrim@cs.cmu.edu).

MACHINE LEARNING THEORY

Avrim Blum

MW 10:30-11:50, GHC 4303

Course description: This course will focus on theoretical aspects of machine learning. We will examine questions such as: What kinds of guarantees can we prove about learning algorithms? Can we design algorithms for interesting learning tasks with strong guarantees on accuracy and amounts of data needed? What can we say about the inherent ease or difficulty of learning problems? Can we devise models that are both amenable to theoretical analysis and make sense empirically? Addressing these questions will bring in connections to probability and statistics, online algorithms, game theory, complexity theory, information theory, cryptography, and empirical machine learning research. Grading will be based on 6 homework assignments, class participation, a small class project, and a take-home final (worth about 2 homeworks). Students from time to time will also be asked to help with the grading of assignments.
[2009 version of the course]

Prerequisites: A Theory/Algorithms background or a Machine Learning background.

Text (recommended): An Introduction to Computational Learning Theory by Michael Kearns and Umesh Vazirani, plus papers and notes for topics not in the book.

Office hours: Wed 3-4 or send email to make an appointment.

Tentative plan

1. 01/13: Introduction. PAC model and Occam's razor.
2. 01/15: The Mistake-Bound model. Combining expert advice. Connections to info theory.
3. 01/20: The Winnow algorithm.
4. 01/22: The Perceptron Algorithm, margins, and intro to kernels. [Slides]
5. 01/27: Uniform convergence, tail inequalities (Chernoff/Hoeffding), VC-dimension I. [more notes]
6. 01/29: VC-dimension II (proofs of main theorems).
7. 02/03: Boosting I: Weak to strong learning, Schapire's original method.
8. 02/05: Boosting II: AdaBoost, connection to WM analysis, and L_1 margin bounds.
9. 02/10: Rademacher bounds and McDiarmid's inequality.
11. 02/17: MB=>PAC, Support Vector Machines, L_2 margin uniform-convergence bounds.
12. 02/19: Margins, kernels, and general similarity functions (L_1 and L_2 connection).
13. 02/24: No class today. Nina Balcan talk at 10:00am in GHC 6115.
14. 02/26: Learning with noise and the Statistical Query model I.
15. 03/03: No class today. Open house.
16. 03/05: Statistical Query model II: characterizing weak SQ-learnability.
17. 03/17: Fourier-based learning and learning with Membership queries: the KM algorithm.
18. 03/19: Fourier spectrum of decision trees and DNF. Also hardness of learning parities with kernels.
19. 03/24: Learning Finite State Environments.
20. 03/26: MDPs and reinforcement learning.
21. 03/31: Maxent and maximum likelihood exponential models; connection to Winnow.
22. 04/02: Offline->online optimization, Kalai-Vempala.
23. 04/07: The Adversarial Multi-Armed Bandit Setting.
24. 04/09: Game theory I (zero-sum and general-sum games).
25. 04/14: Game theory II (achieving low internal/swap regret; congestion/exact-potential games).
26. 04/16: Semi-supervised learning.
27. 04/21: Some loose ends: Compression bounds, Bshouty's algorithm.
28. 04/23: Project presentations.
29. 04/28: Project presentations.
30. 04/30: Project presentations.
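As an illustration of the Winnow lecture (01/20), here is a hedged sketch of a Winnow variant (promotion by doubling, demotion by halving) learning a monotone disjunction online; the target function and example stream below are made up:

```python
import itertools

# Hedged sketch of a Winnow variant; the target disjunction and the
# example stream below are made up for illustration.

def winnow_run(stream, n):
    """Run online Winnow with threshold n over (x, label) pairs; return mistakes."""
    w = [1.0] * n
    mistakes = 0
    for x, y in stream:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= n else 0
        if pred != y:
            mistakes += 1
            if y == 1:
                # False negative: double the weights of the active variables.
                w = [wi * 2 if xi else wi for wi, xi in zip(w, x)]
            else:
                # False positive: halve the weights of the active variables.
                w = [wi / 2 if xi else wi for wi, xi in zip(w, x)]
    return mistakes

# Target: x0 OR x1 over n = 4 variables; cycle through all inputs a few times.
n = 4
stream = [(x, 1 if (x[0] or x[1]) else 0)
          for x in itertools.product([0, 1], repeat=n)] * 3
print(winnow_run(stream, n))
```

Littlestone's analysis gives a mistake bound of O(r log n) for r-literal disjunctions; this L_1-margin-style guarantee is the counterpart to the Perceptron's L_2-margin bound.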