CMU 15-859(B), Spring 2010
 MACHINE LEARNING THEORY
 MW 3:00-4:20, GHC 4102
Course description: 
This course will focus on theoretical aspects of machine learning. We
will examine questions such as: What kinds of guarantees can we prove
about learning algorithms?  Can we design algorithms for interesting
learning tasks with strong guarantees on accuracy and amounts of data
needed?  What can we say about the inherent ease or difficulty of
learning problems?  Can we devise models that are both amenable to
theoretical analysis and make sense empirically? Addressing these
questions will bring in connections to probability and statistics,
online algorithms, game theory, complexity theory, information theory,
cryptography, and empirical machine learning research.
Grading will be based on 
6 homework assignments, class
      participation, a small class project, and a take-home final
      (worth about 2 homeworks).  Students from time
      to time will also be asked to help with the grading of
		assignments. 
 
 
Prerequisites: Either 15-781/10-701/15-681
      Machine Learning, or 15-750 Algorithms, or a 
      Theory/Algorithms background or a Machine Learning background.
Text:
An Introduction to Computational Learning Theory by Michael Kearns 
and Umesh Vazirani, plus papers and notes for topics not in the book.
Office hours: Email / stop by anytime!
Handouts
 Lecture Notes & tentative plan
-  01/11: Introduction. PAC model and Occam's
		    razor.
-  01/13: The Mistake-Bound model.  Combining
	  expert advice.  Connections to info theory.
-  01/20: The Winnow algorithm.
-  01/25: The Perceptron Algorithm, margins,
	  & intro to kernels.
-  01/27: Uniform convergence, tail
            inequalities (Chernoff/Hoeffding), VC-dimension I. 
            [more notes]
-  02/01: VC-dimension II (proofs of main theorems).
-  02/03: Boosting I: Weak to strong learning,
	  Schapire's original method. 
    [notes from Shuchi Chawla's course]
-  02/15: Boosting II: Adaboost + connection
	      to WM analysis + L_1 margin bounds.
-  02/17: Rademacher bounds and McDiarmid's inequality.
-  02/22: Support Vector Machines, properties
	  of kernels, MB=>PAC, L_2 margin bounds.
-  02/24: Margins, kernels, and general
	  similarity functions (L_1 and L_2 connection).
-  03/01: Maxent and maximum likelihood exponential
              models; connection to winnow.
-  03/03: Learning with noise and the Statistical Query model I.
-  03/15: Statistical Query model II:
	  characterizing weak SQ-learnability.
-  03/17: Fourier-based learning and learning
	  with Membership queries: the KM algorithm.
-  03/22: Fourier spectrum of decision trees
	  and DNF.  Also hardness of learning parities with kernels.
-  03/24: Learning Finite State
	  Environments. (more notes)
-  03/29: Learning Finite State Environments,
	  contd.
-  03/31: MDPs and reinforcement learning.
-  04/05: Offline->online optimization.
-  04/07: Bandit algorithms and sleeping experts.
-  04/12: Game theory I (zero-sum and
	  general-sum games).  [See also this survey article]
-  04/14: Game theory II (achieving low
	  internal/swap regret.  Congestion/exact-potential games.
-  04/19: Semi-supervised learning.
 Additional Readings & More Information
- Robert Williamson, John Shawe-Taylor, Bernhard Scholkopf, Alex
Smola 
Sample
Based Generalization Bounds.  Gives tighter generalization bounds
where instead of using "the maximum number of ways of labeling a set of 2m
points" you can use "the number of ways of labeling your actual sample". 
- See also a previous version of
      the course.