- Instructor: Avrim Blum
- Time: TR 1:30-2:50
- Place: Wean 5409

**Text:** *An Introduction to Computational Learning Theory* by Michael Kearns
and Umesh Vazirani, plus papers and notes for topics not in the book.

**Office hours:** Just stop by my office or make an appointment.

**Note:** This is the 2004 version of the
Machine Learning Theory course. A 2007
version of the course is also available.

- General Course Information
- Handout on Tail inequalities.
- Homework 1. Solutions.
- Homework 2. Solutions.
- Homework 3. Solutions.
- Homework 4. Solutions.
- Homework 4.5.
- Homework 5. Solutions.
- Homework 6. Solutions.
- Info on projects and presentations.

- 01/13: Introduction, basic definitions, the consistency model.
- 01/15: The Mistake-bound model, relation to consistency, halving and Std Opt algorithms.
- 01/20: The weighted majority algorithm (combining "expert" advice) & applications.
- 01/22: The Perceptron and Winnow algorithms.
- 01/27: The PAC model: basic results and Occam's razor.
- 01/29: Uniform convergence, Chernoff/Hoeffding bounds, MB=>PAC, greedy set-cover. Handout on Tail inequalities.
- 02/03: VC-dimension. Proof of Sauer's lemma. slides and notes.
- 02/05: VC-dimension, contd. Proof of main thm.
- 02/10: Boosting I: weak vs strong learning, basic issues, Schapire's original algorithm.
- 02/12: Boosting II: Adaboost. Use as proof of Hoeffding bounds.
- 02/17: Learning from noisy data. SQ model intro.
- 02/19: SQ => PAC + noise. SQ-dimension. Begin hardness results for SQ algorithms.
- 02/24: Characterizing SQ learnability with Fourier analysis, contd. (same notes as last time)
- 02/26: Weak-learning of monotone functions.
- 03/02: Membership queries I: basic algorithms (monotone DNF, log-n var functions); begin the KM algorithm for finding large Fourier coefficients.
- 03/04: Membership queries II: finish KM algorithm. Fourier spectrum of Decision Trees and DNF.
- 03/16: Membership queries III: Bshouty's alg for decision trees and XOR-of-terms. Cryptographic lower bounds.
- 03/18: Angluin's algorithm for learning DFAs (ppt). More notes.
- 03/23: Learning finite-state environments without a reset. Also slides.
- 03/25: MDPs and reinforcement learning.
- 03/30: More on MDPs. (short class today)
- 04/01: Kalai-Vempala algorithm for online optimization.
- 04/06: Maxent and maximum-likelihood exponential models. Connection to winnow.
- 04/08: Sham Kakade guest lecture: Deterministic Calibration and Nash Equilibrium
- 04/13: Linear separators, Margins, & Kernels.
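The weighted majority algorithm from the 01/20 lecture can be sketched in a few lines. This minimal version (my own illustration, not the course's code) predicts with the weighted vote of the experts and multiplies the weight of each mistaken expert by beta:

```python
# Sketch of the deterministic weighted majority algorithm (an
# illustrative version, not code from the course): each round, predict
# the weighted vote of n "experts", then multiply the weight of every
# expert that erred by beta.

def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """expert_predictions: one list of n {0,1} predictions per round.
    outcomes: the true {0,1} label for each round.
    Returns the total number of mistakes the algorithm makes."""
    n = len(expert_predictions[0])
    weights = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_predictions, outcomes):
        vote_1 = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_0 = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_1 >= vote_0 else 0
        if guess != y:
            mistakes += 1
        # Penalize every expert that predicted incorrectly this round.
        weights = [w * beta if p != y else w
                   for w, p in zip(weights, preds)]
    return mistakes
```

With beta = 1/2 this makes at most roughly 2.41(m + lg n) mistakes, where m is the number of mistakes of the best expert.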

**Online Learning**

- Nick Littlestone, Learning Quickly when Irrelevant Attributes Abound: A New Linear-threshold Algorithm. *Machine Learning* 2:285--318, 1987. (The version pointed to here is the tech report UCSC-CRL-87-28.) This is the paper that first defined the Mistake-bound model, and also introduced the Winnow algorithm. A great paper.
- Littlestone and Warmuth, The Weighted Majority Algorithm. *Information and Computation* 108(2):212-261, 1994. (The version pointed to here is the tech report UCSC-CRL-91-28.) Introduces the weighted majority algorithm, along with a number of variants. Also a great paper.
- Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David
P. Helmbold, Robert E. Schapire, and Manfred K. Warmuth. How
to use expert advice. *Journal of the ACM*, 44(3):427-485, May 1997.
Continuing the line of research in the [LW] paper, this gives
tighter analyses of multiplicative-weighting expert algorithms and
addresses a number of issues.
- Yoav Freund and Robert E. Schapire. Adaptive game playing using
multiplicative weights. *Games and Economic Behavior*, 29:79-103,
1999.
- Avrim Blum, On-Line
Algorithms in Machine Learning (a survey). From "Online
Algorithms: the state of the art", Fiat
and Woeginger eds., LNCS #1442, 1998.
- Adam Kalai and Santosh Vempala, Efficient
algorithms for the online decision problem, COLT '03. Earlier
version (a little simpler and closer to notation used in class): Follow
the Leader for Online Optimization.
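The Winnow algorithm introduced in the Littlestone paper above can be sketched roughly as follows. This minimal version, with doubling/halving updates and threshold n, is my own illustration rather than code from the paper:

```python
# A rough sketch of Littlestone's Winnow algorithm for learning monotone
# disjunctions over {0,1}^n (illustrative, not from the paper): predict 1
# when the weighted sum of active attributes reaches the threshold n;
# double active weights on false negatives, halve them on false positives.

def winnow(examples, n):
    """examples: list of (x, y) pairs with x a {0,1} list of length n
    and y in {0,1}. One pass; returns (final weights, mistake count)."""
    weights = [1.0] * n
    threshold = n
    mistakes = 0
    for x, y in examples:
        total = sum(w * xi for w, xi in zip(weights, x))
        prediction = 1 if total >= threshold else 0
        if prediction != y:
            mistakes += 1
            if y == 1:
                # Promotion step: double weights of active attributes.
                weights = [w * 2 if xi else w for w, xi in zip(weights, x)]
            else:
                # Demotion step: halve weights of active attributes.
                weights = [w / 2 if xi else w for w, xi in zip(weights, x)]
    return weights, mistakes
```

For a target disjunction of r out of n variables, Winnow makes only O(r log n) mistakes, the attribute-efficiency that is the point of the paper's title.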

**PAC model and sample complexity**

- My FOCS'03 tutorial on Machine
Learning Theory
- David Haussler, chapter on the PAC learning model and decision-theoretic generalizations, with applications to neural nets. From *Mathematical
Perspectives on Neural Networks*, Lawrence Erlbaum Associates, 1995, containing reprinted material from "Decision Theoretic
Generalizations of the PAC Model for Neural Net and Other Learning Applications", *Information and Computation*, Vol. 100,
September 1992, pp. 78-150. This is a really nice survey of the PAC
model and various sample-complexity results.
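As a concrete instance of the sample-complexity results these surveys cover, the standard Occam's razor bound for a finite class H says that m >= (1/eps)(ln|H| + ln(1/delta)) examples suffice for a consistent learner. The arithmetic below (toy numbers, my own example) evaluates it directly:

```python
# Illustrative arithmetic for the Occam/PAC sample-complexity bound
#     m >= (1/eps) * (ln|H| + ln(1/delta))
# The example numbers are mine, not from the readings.
import math

def occam_sample_size(hypothesis_count, eps, delta):
    """Smallest integer m satisfying the bound above."""
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / eps)

# E.g., conjunctions over n = 10 boolean variables (|H| = 3^10),
# with error eps = 0.1 and confidence delta = 0.05:
m = occam_sample_size(3 ** 10, eps=0.1, delta=0.05)
```

Note the logarithmic dependence on |H| and 1/delta: halving eps doubles the required sample size, while squaring |H| only doubles its contribution.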

**Boosting**

- The original paper: Yoav Freund and Rob Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, *Journal of Computer and System Sciences*, 55(1):119-139, 1997.
- A recent overview by Rob Schapire: The boosting approach to machine learning: An overview. In MSRI Workshop on Nonlinear Estimation and Classification, 2002.
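The core loop of AdaBoost from the Freund-Schapire paper above can be sketched compactly. This toy version, which selects its weak hypothesis from a fixed pool by minimum weighted error, is my own illustration, not the paper's code; labels are in {-1, +1}:

```python
# A compact AdaBoost sketch (illustrative; selects from a fixed pool of
# weak hypotheses rather than calling a weak learner). Labels in {-1,+1}.
import math

def adaboost(X, y, hypotheses, rounds):
    """hypotheses: callables x -> {-1, +1}. Returns a list of
    (alpha, h) pairs defining the final weighted-majority vote."""
    m = len(X)
    dist = [1.0 / m] * m                    # distribution over examples
    ensemble = []
    for _ in range(rounds):
        # Choose the pool hypothesis with smallest weighted error.
        h, err = min(
            ((h, sum(d for d, x, yi in zip(dist, X, y) if h(x) != yi))
             for h in hypotheses),
            key=lambda pair: pair[1])
        if err >= 0.5:                      # no weak learner available
            break
        err = max(err, 1e-10)               # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Reweight: raise weight on mistakes, lower it on correct examples.
        dist = [d * math.exp(-alpha * yi * h(x))
                for d, x, yi in zip(dist, X, y)]
        z = sum(dist)
        dist = [d / z for d in dist]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of the chosen hypotheses."""
    return 1 if sum(alpha * h(x) for alpha, h in ensemble) >= 0 else -1
```

The paper shows the training error of the combined vote drops as the product over rounds of 2*sqrt(err_t(1 - err_t)), so any edge over 1/2 per round drives it down exponentially fast.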

**Fourier analysis, weak learning, SQ learning**

- Avrim Blum, Merrick Furst, Jeffrey Jackson, Michael Kearns, Yishay Mansour, and Steven Rudich, Weakly Learning DNF and Characterizing Statistical Query Learning Using Fourier Analysis, STOC '94, pp. 253--262.
- Y. Mansour. Learning
Boolean Functions via the Fourier Transform. Survey article in
*Theoretical Advances in Neural Computation and Learning*, 391--424
(1994).
- A. Blum, C. Burch, and J. Langford, On
Learning Monotone Boolean Functions. Proceedings of the
39th Annual Symposium on Foundations of Computer Science (FOCS '98).