Scroll down for CMU 15-859(B) Machine Learning Theory, Spring 2014
UIUC CS-589, Fall 2014
TOPICS IN MACHINE LEARNING THEORY
Wed/Fri 11:00-12:15, SC 1109
(Office hours: Tues 1:00-2:00, SC 3212)
This seminar class will focus on new results and directions in machine
learning theory. Machine learning theory concerns questions such as:
What kinds of guarantees can we prove about practical machine learning
methods, and can we design algorithms achieving desired guarantees?
(Why) is Occam's razor a good idea and what does that even mean? What
can we say about the inherent ease or difficulty of different types of
learning problems? Addressing these questions will bring in
connections to probability and statistics, online algorithms, game
theory, computational geometry, and empirical machine learning research.
The first half of the course will involve the instructor presenting
some classic results and background including regret guarantees,
combining expert advice, Winnow and Perceptron algorithms,
VC-dimension, Rademacher complexity, SVMs, and Kernel functions. The
second half will involve student-led discussions of recent papers in
areas such as deep learning, multi-task learning, tensor methods,
structured prediction, dictionary learning, and other topics.
Lecture Notes and Tentative Plan
- 08/27: Introduction. The PAC model and Occam's razor.
- 08/29: The Online Mistake-Bound model, Combining
Expert Advice / Multiplicative Weights,
Regret Minimization and connections to Game Theory.
- 09/03: Shifting experts, the Winnow
Algorithm, L_1 margin bounds.
- 09/05: The Perceptron Algorithm, Margins, and intro to Kernels.
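To make the "Combining Expert Advice / Multiplicative Weights" topic in the plan above concrete, here is a minimal sketch of the deterministic Weighted Majority algorithm. It is an illustration only, not taken from the course notes: the function name `weighted_majority`, the 0/1 label encoding, and the penalty factor `beta = 0.5` are assumptions made for this example.

```python
def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """Deterministic Weighted Majority over binary (0/1) predictions.

    expert_predictions[t][i] is expert i's prediction in round t and
    outcomes[t] is the true label for that round.
    Returns (learner_mistakes, expert_mistakes).
    """
    n = len(expert_predictions[0])
    weights = [1.0] * n          # every expert starts with weight 1
    learner_mistakes = 0
    expert_mistakes = [0] * n
    for preds, y in zip(expert_predictions, outcomes):
        # Predict with the weighted vote of the experts (ties go to 1).
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_one >= vote_zero else 0
        if guess != y:
            learner_mistakes += 1
        # Multiplicatively penalize every expert that was wrong this round.
        for i, p in enumerate(preds):
            if p != y:
                weights[i] *= beta
                expert_mistakes[i] += 1
    return learner_mistakes, expert_mistakes


if __name__ == "__main__":
    # Hypothetical data: expert 0 is always right, expert 1 is always
    # wrong, and expert 2 always predicts 1.
    outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
    rounds = [[y, 1 - y, 1] for y in outcomes]
    print(weighted_majority(rounds, outcomes))
```

With `beta = 0.5`, the standard analysis bounds the learner's mistakes by roughly 2.41 (m + log2 n), where m is the number of mistakes of the best expert and n is the number of experts.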
CMU 15-859(B), Spring 2014
MACHINE LEARNING THEORY
MW 10:30-11:50, GHC 4303
This course will focus on theoretical aspects of machine learning. We
will examine questions such as: What kinds of guarantees can we prove
about learning algorithms? Can we design algorithms for interesting
learning tasks with strong guarantees on accuracy and amounts of data
needed? What can we say about the inherent ease or difficulty of
learning problems? Can we devise models that are both amenable to
theoretical analysis and make sense empirically? Addressing these
questions will bring in connections to probability and statistics,
online algorithms, game theory, complexity theory, information theory,
cryptography, and empirical machine learning research.
Grading will be based on 6 homework assignments, class
participation, a small class project, and a take-home final
(worth about 2 homeworks). Students from time to time will
also be asked to help with the grading of assignments.
Prerequisites: A Theory/Algorithms background or a Machine
Learning background.
Text: An Introduction to Computational Learning Theory by Michael Kearns
and Umesh Vazirani, plus papers and notes for topics not in the book.
Office hours: Wed 3-4 or send email to make an appointment.
- 01/13: Introduction. PAC model and Occam's razor.
- 01/15: The Mistake-Bound model. Combining
expert advice. Connections to info theory.
- 01/20: The Winnow algorithm.
- 01/22: The Perceptron Algorithm, margins,
& intro to kernels plus Slides.
- 01/27: Uniform convergence, tail
inequalities (Chernoff/Hoeffding), VC-dimension I.
- 01/29: VC-dimension II (proofs of main theorems).
- 02/03: Boosting I: Weak to strong learning,
Schapire's original method.
- 02/05: Boosting II: AdaBoost + connection
to WM analysis + L_1 margin bounds.
- 02/10: Rademacher bounds and McDiarmid's inequality.
- 02/12: Rademacher bounds II.
- 02/17: MB=>PAC, Support Vector Machines,
L_2 margin uniform-convergence bounds.
- 02/19: Margins, kernels, and general
similarity functions (L_1 and L_2 connection).
- 02/24: No class today. Nina Balcan talk at 10:00am in GHC 6115.
- 02/26: Learning with noise and the Statistical Query model I.
- 03/03: No class today. Open house.
- 03/05: Statistical Query model II:
characterizing weak SQ-learnability.
- 03/17: Fourier-based learning and learning
with Membership queries: the KM algorithm.
- 03/19: Fourier spectrum of decision trees
and DNF. Also hardness of learning parities with kernels.
- 03/24: Learning Finite State
- 03/26: MDPs and reinforcement learning.
- 03/31: Maxent and maximum likelihood exponential
models; connection to Winnow.
- 04/02: Offline->online optimization, Kalai-Vempala.
- 04/07: The Adversarial Multi-Armed Bandit Setting.
- 04/09: Game theory I (zero-sum and
- 04/14: Game theory II (achieving low
internal/swap regret; congestion/exact-potential games).
- 04/16: Semi-supervised learning
- 04/21: Some loose ends: Compression bounds, Bshouty's algorithm.
- 04/23: Project presentations
- 04/28: Project presentations
- 04/30: Project presentations
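As a concrete instance of one algorithm from the schedule above (the Perceptron algorithm, 01/22), here is a minimal sketch for linearly separable data. It is an illustration under stated assumptions, not code from the course: the function name `perceptron`, the `max_passes` cap, and the toy data set are all hypothetical, and labels are assumed to be in {-1, +1}.

```python
def perceptron(examples, max_passes=100):
    """Run the Perceptron on (x, y) pairs with labels y in {-1, +1}.

    Cycles through the examples until one full pass makes no mistake
    (or max_passes is reached). Returns (w, mistakes).
    """
    dim = len(examples[0][0])
    w = [0.0] * dim              # start from the zero vector
    mistakes = 0
    for _ in range(max_passes):
        clean_pass = True
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:   # wrong side of (or on) the hyperplane
                # Additive update: move w toward y * x.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
                clean_pass = False
        if clean_pass:
            break
    return w, mistakes


if __name__ == "__main__":
    # Hypothetical toy data, separable by the unit vector (1, 0)
    # with margin 1; the largest squared norm of any point is 5.
    data = [((1, 1), 1), ((2, -1), 1), ((-1, 0.5), -1), ((-2, -1), -1)]
    print(perceptron(data))
```

The classical mistake bound says the number of updates is at most (R / gamma)^2, where R is the largest norm of any example and gamma is the margin of some unit-length separator; for the toy data above that bound is 5.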
Additional Readings & More Information
- O. Bousquet, S. Boucheron, and G. Lugosi, Introduction
to Statistical Learning Theory.
- N. Cristianini and J. Shawe-Taylor,
Kernel Methods for Pattern Analysis, 2004.
- N. Cristianini and J. Shawe-Taylor,
An Introduction to Support
Vector Machines (and other kernel-based learning methods), 2000.
- Nick Littlestone, Learning Quickly when Irrelevant Attributes Abound: A New Linear-threshold Algorithm. This is the Winnow paper from 1987, which also discusses a number of aspects of the online mistake-bound model.
- The AdaBoost paper: Yoav Freund and Rob Schapire,
A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences, 55(1):119-139, 1997.
- Robert Williamson, John Shawe-Taylor, Bernhard Scholkopf, Alex
Smola, Sample Based Generalization Bounds. Gives tighter generalization bounds
where instead of using "the maximum number of ways of labeling a set of 2m
points" you can use "the number of ways of labeling your actual sample".
- Maria-Florina Balcan, Avrim Blum, and Nathan Srebro, Improved Guarantees for Learning via Similarity Functions. Gives formulation and analysis for learning with general similarity functions. Also shows that for any class of large SQ dimension, there cannot be a kernel that has large margin for all (or even a non-negligible fraction) of the functions in the class.
- Avrim Blum, Merrick Furst, Jeffrey Jackson, Michael Kearns, Yishay
Mansour, and Steven Rudich, Weakly
Learning DNF and Characterizing Statistical Query Learning
Using Fourier Analysis. Defines and analyzes SQ
dimension. Also weak-learning of DNF via Fourier analysis.
- PASCAL video lectures.
- Avrim Blum and Yishay Mansour,
Learning, Regret Minimization, and
Equilibria, Chapter 4 of Algorithmic Game
Theory, Noam Nisan, Tim Roughgarden, Eva
Tardos, and Vijay Vazirani, eds.