Machine Learning, 10-701 and 15-781

Prof. Carlos Guestrin
School of Computer Science, Carnegie Mellon University

Spring 2007

Class lectures: Mondays & Wednesdays from 10:30-11:50 in Wean Hall 7500

Review sessions: Thursdays 5:30-6:50 in Wean Hall 5409



It is hard to imagine anything more fascinating than automated systems that improve their own performance. The study of learning from data is commercially and scientifically important. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics, and algorithms currently needed by people who do research in learning and data mining, or who may need to apply learning or data mining techniques to a target problem. The topics of the course draw from classical statistics, machine learning, data mining, Bayesian statistics, and statistical algorithmics.

Students entering the class should have a working knowledge of probability, statistics, and algorithms, though the class has been designed to allow students with a strong numerate background to catch up and fully participate.


Page links


Instructor


Teaching Assistants

Questions

Questions pertaining to homework problems should first be directed to the TA assigned to that problem, according to the following schedule. Questions may also be emailed to 10701-instructors@cs.cmu.edu.

HMWK #1
Out: 24-Jan
In: 7-Feb

Assignment: [PDF]
Solutions: [PDF]
UCI Breast Cancer dataset [zip]
HMWK #2
Out: 7-Feb
In: 21-Feb

Assignment: [PDF]
Solutions: [PDF]
Bupa dataset [zip]
HMWK #3
Out: 21-Feb
In: Monday 5-Mar
(no late days allowed)

Assignment: [PDF]
Solutions: [PDF]
libsvm download: [link]
Matlab and data files: [zip]
MIDTERM
Solutions [PDF]
HMWK #4
Out: 28-Mar
In: 11-Apr

Assignment: [PDF]
Solutions: [PDF]
Matlab and data files: [zip]
HMWK #5
Out: 11-Apr
In: 25-Apr

Assignment: [PDF]
Matlab and data files: [zip]
Solutions: [PDF]

Administrative Assistant

Textbooks


Announcement Emails


Grading


Auditing


Homework policy

Important Note: As we often reuse problem set questions from previous years, which may be covered by papers and web pages, we expect students not to copy, refer to, or look at the solutions when preparing their answers. Since this is a graduate class, we expect students to want to learn rather than to search online for answers. The purpose of the problem sets in this class is to help you think about the material, not just to give us the right answers. Therefore, please restrict your attention to the books mentioned on the webpage when solving the problems. If you do use other material, it must be acknowledged clearly with a citation on the submitted solution.


Collaboration policy

Homeworks will be done individually: each student must hand in their own answers. It is acceptable, however, for students to collaborate in figuring out answers and to help each other solve the problems. We assume that, as participants in a graduate course, you will take responsibility for making sure you personally understand the solution to any work arising from such collaboration. You must also indicate on each homework with whom you collaborated. The final project may be completed by small teams.


Late homework policy


Homework regrades policy

If you feel that we have made an error in grading your homework, please turn in your homework with a written explanation to Monica, and we will consider your request. Please note that regrading of a homework may cause your grade to go up or down.


Final project

For the project milestone, roughly half of the project work should be completed. A short, graded write-up will be required, and we will provide feedback.

Lecture schedule


Module

Material covered

Class details, online material, and homework

Module 1: Basics
(1 Lecture)
  • What is learning?
    • Version spaces
    • Sample complexity
    • Training set/test set split
  • Point estimation
    • Loss functions
    • MLE
    • Bayesian
    • MAP
    • Bias-Variance tradeoff
Mon., Jan 15:
** No Class. MLK B-Day **  

Wed., Jan 17:
 
Module 2: Linear models
(3 Lectures)
  • Linear regression [Applet]
    http://www.mste.uiuc.edu/users/exner/java.f/leastsquares/
  • Bias-Variance tradeoff
  • Overfitting
  • Bayes optimal classifier
  • Naive Bayes [Applet]
    http://www.cs.technion.ac.il/~rani/LocBoost/
  • Logistic regression [Applet]
  • Discriminative v. Generative models [Applet]
Mon., Jan. 22:
  • Lecture: Gaussians, Linear Regression, Bias-Variance Tradeoff
    [Slides] [Annotated]
  • Readings: Bishop 1.1 to 1.4, Bishop 3.1, 3.1.1, 3.1.4, 3.1.5, 3.2, 3.3, 3.3.1, 3.3.2
 

Wed., Jan 24:
 

Mon., Jan 29:
 
Module 3: Non-linear models
Model selection
(5 Lectures)
  • Decision trees [Applet]
  • Overfitting, again
  • Regularization
  • MDL
  • Cross-validation
  • Boosting [Adaboost Applet]
    www.cse.ucsd.edu/~yfreund/adaboost
  • Instance-based learning [Applet]
    www.site.uottawa.ca/~gcaron/applets.htm
    • K-nearest neighbors
    • Kernels
  • Neural nets [CMU Course]
    www.cs.cmu.edu/afs/cs/academic/class/15782-s04/
Wed., Jan. 31:
 

Mon., Feb 5:

Wed., Feb. 7:
 

Mon., Feb. 12:
  • Lecture: Cross Validation, Simple Model Selection, Regularization, MDL, Neural Nets
    [Slides] [Annotated]
  • Readings: (Bishop 1.3) Model Selection / Cross Validation
  • (Bishop 3.1.4) Regularized least squares
  • (Bishop 5.1) Feed-forward Network Functions
 

Wed., Feb. 14:
  • Lecture: Neural Nets, Instance-based Learning
    [Slides] [Annotated]
  • Readings: (Bishop 5.1) Feed-forward Network Functions
  • (Bishop 5.2) Network Training
  • (Bishop 5.3) Error Backpropagation
 
Module 4: Margin-based approaches
(2 Lectures)
  • SVMs [Applets]
    www.site.uottawa.ca/~gcaron/applets.htm
  • Kernel trick
Mon., Feb 19:
  • Lecture: Instance-based Learning, SVMs
    [Slides] [Annotated]
  • Readings: (Bishop 2.5) Nonparametric Methods

Wed., Feb. 21:
  • Lecture: SVMs
    [Slides] [Annotated]
  • Readings: (Bishop 6.1,6.2) Kernels
  • (Bishop 7.1) Maximum Margin Classifiers
  • Hearst 1998: High Level Presentation
  • Burges 1998: Detailed Tutorial
  • (Optional) Platt 1998: Training SVMs with Sequential Minimal Optimization

Module 5: Learning theory
(3 Lectures)
  • Sample complexity
  • PAC learning [Applets]
    www.site.uottawa.ca/~gcaron/applets.htm
  • Error bounds
  • VC-dimension
  • Margin-based bounds
  • Large-deviation bounds
    • Hoeffding's inequality, Chernoff bound
  • Mistake bounds
  • No Free Lunch theorem
Mon., Feb. 26:
  • Lecture: SVMs - The Kernel Trick
    [Slides] [Annotated]

Wed., Feb. 28:
  • Lecture: SVMs - The Kernel Trick, Learning Theory
    [Slides] [Annotated]

Mon., Mar. 5:
  • Lecture: Learning Theory, Midterm review
    [Slides] [Annotated]
  • Readings: (Mitchell Chapter 7) Computational Learning Theory

Mid-term Exam

All material thus far
Wed., Mar 7:
     

Spring break

     

Mon., Mar. 12:
** No class **

Wed., Mar. 14:
** No class **
Module 6: Structured models
(4 Lectures)
  • HMMs
    • Forwards-Backwards
    • Viterbi
    • Supervised learning
  • Graphical Models
Mon., Mar. 19:
  • Lecture: Bayes nets - Representation
    [Slides] [Annotated]
  • Readings: (Bishop 8.1,8.2) Bayesian Networks

Wed., Mar. 21:
  • Lecture: Bayes nets - Representation (cont.), Inference
    [Slides] [Annotated]

Mon., Mar. 26:
  • Lecture: Bayes nets - Inference (cont.),
    HMMs
    [Slides] [Annotated]
  • Readings: (Bishop 8.4.1,8.4.2) - Inference in Chain/Tree Structures
    Rabiner's Detailed HMMs Tutorial

Wed., Mar. 28:
  • Lecture: HMMs
    Bayes nets - Structure Learning
    [Slides] [Annotated]
  • Additional Reading: Heckerman BN Learning Tutorial
  • Additional Reading: Tree-Augmented Naive Bayes paper

Module 7: Unsupervised and semi-supervised learning
(6 Lectures)
  • K-means
  • Expectation Maximization (EM)
  • Combining labeled and unlabeled data
    • EM
    • reweighting labeled data
    • Co-training
    • unlabeled data and model selection
  • Dimensionality reduction
  • Feature selection
Mon., Apr. 2:
  • Lecture: Bayes nets - Structure Learning
    Clustering - K-means & Gaussian mixture models
    [Slides] [Annotated]
  • Readings: (Bishop 9.1, 9.2) - K-means, Mixtures of Gaussians

Wed., Apr. 4:
  • Lecture: Clustering - K-means & Gaussian mixture models
    [Slides] [Annotated]
  • Readings: Neal and Hinton EM paper

Mon., Apr. 9:
  • Lecture: EM
    Baum-Welch (EM for HMMs)
    [Slides] [Annotated]
  • Readings: (Bishop 9.3, 9.4) - EM

Wed., Apr. 11:
  • Lecture: Baum-Welch (EM for HMMs)
    EM for Bayes Nets
    [Slides] [Annotated]
  • Readings: Ghahramani, "An introduction to HMMs and Bayesian Networks"

Mon., Apr. 16:
  • Lecture: EM for Bayes Nets
    Co-Training for semi-supervised learning
    [Slides] [Annotated]
  • Readings: Blum and Mitchell co-training paper
  • Optional reading: Joachims Transductive SVMs paper

Wed., Apr. 18:
  • Guest lecture by Noah Smith

Mon., Apr. 23:
  • Lecture: Semi-supervised learning in SVMs
    Principal Component Analysis (PCA)
    [Slides] [Annotated]
  • Reading: Shlens' PCA tutorial
  • Optional reading: Wall et al. 2003 - PCA for gene expression data
Module 8: Learning to make decisions
(3 Lectures)
  • Markov decision processes
  • Reinforcement learning
Wed., Apr. 25:
  • Lecture: Principal Component Analysis (PCA) (cont.)
    Markov Decision Processes
    [Slides] [Annotated]
  • Reading: Kaelbling et al. Reinforcement Learning tutorial

Mon., Apr 30:
  • Lecture: Markov Decision Processes
    Reinforcement Learning
    [Slides] [Annotated]
  • Reading: Brafman and Tennenholtz: Rmax paper

Module 9: Advanced topics
(3 Lectures)
  • Text data
  • Hierarchical Bayesian models
  • Tackling very large datasets
  • Active learning
  • Overview of follow-up classes
Wed., May 2:
  • Lecture: Reinforcement Learning
    Big Picture
    [Slides] [Annotated]

Project Poster Session

Fri., May 4:
Newell-Simon Hall Atrium
2:00-5:00pm


Project Paper

Thur., May 10:
Project paper due


Final Exam

All material thus far
Tuesday, May 15th, 1-4 p.m.
Location: Baker Hall, Room A51


Recitation

All recitations are Thursdays, 5:30-6:50, Wean Hall 5409, unless otherwise noted.

Date      Instructor   Topic
Jan. 18   Andy         Review of Probability; Distributions; Bayes Rule
Jan. 24   Brian        Introduction to Matlab (code) (obtain Matlab) -- 5:30-6:50pm, NSH 3305
Jan. 25   Jon          Naive Bayes Classification [Slides]
Feb. 1    Purna        Logistic Regression
Feb. 8    Andy         Boosting
Feb. 15   Brian        Neural Networks
Feb. 22   Jon          Support Vector Machines
Mar. 1    Jon          The Kernel Trick
Mar. 8                 ** NO RECITATION **
Mar. 15*               ** NO RECITATION -- SPRING BREAK **
Mar. 22   Andy         Bayes Nets
Mar. 29   Brian        Hidden Markov Models (Applied to Activity Recognition) [inference notes]
Apr. 5    Purna        Structure Learning, Chow-Liu
Apr. 12   Jon          EM for Gaussian Mixture Models, Spectral Clustering
Apr. 19*               ** NO RECITATION -- University Closed **
Apr. 26   Purna
May 3     Andy         MDPs and Reinforcement Learning (RL Sim Applet)
May 10    All TAs      Final exam review session


Exam Schedule


Additional Resources

Here are some example questions for studying for the midterm/final. Note that these are exams from earlier years and contain some topics that will not appear in this year's final; conversely, some topics will appear this year that do not appear in these examples.


Note to people outside CMU

Feel free to use the slides and materials available online here. Please email the instructors with any corrections or improvements. Additional slides and software are available at the Machine Learning textbook homepage and at Andrew Moore's tutorials page.