Machine Learning, 10-701 and 15-781

Prof. Carlos Guestrin
School of Computer Science, Carnegie Mellon University

Spring 2007

Class lectures: Mondays & Wednesdays from 10:30-11:50 in Wean Hall 7500

Review sessions: Thursdays 5:30-6:50 in Wean Hall 5409



It is hard to imagine anything more fascinating than automated systems that improve their own performance. The study of learning from data is commercially and scientifically important. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics, and algorithms currently needed by people who do research in learning and data mining, or who may need to apply learning or data mining techniques to a target problem. The topics of the course draw from classical statistics, machine learning, data mining, Bayesian statistics, and statistical algorithmics.

Students entering the class should have a working knowledge of probability, statistics, and algorithms, though the class has been designed to allow students with a strong numerate background to catch up and fully participate.


Page links


Instructor


Teaching Assistants

Questions

Questions pertaining to homework problems should first be directed to the TA assigned to that problem, according to the following schedule. Questions may also be emailed to 10701-instructors@cs.cmu.edu.

HMWK #1
Out: 24-Jan
In: 7-Feb

Assignment: [PDF]
Solutions: [PDF]
UCI Breast Cancer dataset [zip]
HMWK #2
Out: 7-Feb
In: 21-Feb

Assignment: [PDF]
Solutions: [PDF]
Bupa dataset [zip]
HMWK #3
Out: 21-Feb
In: Monday 5-Mar
(no late days allowed)

Assignment: [PDF]
Solutions: [PDF]
libsvm download: [link]
Matlab and data files: [zip]
MIDTERM
Solutions [PDF]
HMWK #4
Out: 28-Mar
In: 11-Apr

Assignment: [PDF]
Solutions: [PDF]
Matlab and data files: [zip]
HMWK #5
Out: 11-Apr
In: 25-Apr

Assignment: [PDF]
Matlab and data files: [zip]
Solutions: [PDF]

Administrative Assistant

Textbooks


Announcement Emails


Grading


Auditing


Homework policy

Important Note: As we often reuse problem set questions from previous years, which may be covered by papers and web pages, we expect students not to copy, refer to, or look at the solutions when preparing their answers. Since this is a graduate class, we expect students to want to learn rather than to search online for answers. The purpose of the problem sets in this class is to help you think about the material, not just to give us the right answers. Therefore, please restrict your attention to the books mentioned on the webpage when solving the problems. If you do use other material, it must be acknowledged clearly with a citation on the submitted solution.


Collaboration policy

Homeworks will be done individually: each student must hand in their own answers. It is acceptable, however, for students to collaborate in figuring out answers and to help each other solve the problems. We assume that, as participants in a graduate course, you will take responsibility for making sure you personally understand the solution to any work arising from such collaboration. You must also indicate on each homework with whom you collaborated. The final project may be completed by small teams.


Late homework policy


Homework regrades policy

If you feel that we have made an error in grading your homework, please turn in your homework with a written explanation to Monica, and we will consider your request. Please note that regrading of a homework may cause your grade to go up or down.


Final project

For the project milestone, roughly half of the project work should be completed. A short, graded write-up will be required, and we will provide feedback.

Lecture schedule


Module

Material covered

Class details, online material, and homework

Module 1: Basics
(1 Lecture)
  • What is learning?
    • Version spaces
    • Sample complexity
    • Training set/test set split
  • Point estimation
    • Loss functions
    • MLE
    • Bayesian
    • MAP
    • Bias-Variance tradeoff
Mon., Jan 15:
** No Class. MLK B-Day **  

Wed., Jan 17:
 
Module 2: Linear models
(3 Lectures)
  • Linear regression [Applet]
    http://www.mste.uiuc.edu/users/exner/java.f/leastsquares/
  • Bias-Variance tradeoff
  • Overfitting
  • Bayes optimal classifier
  • Naive Bayes [Applet]
    http://www.cs.technion.ac.il/~rani/LocBoost/
  • Logistic regression [Applet]
  • Discriminative v. Generative models [Applet]
Mon., Jan. 22:
  • Lecture: Gaussians, Linear Regression, Bias-Variance Tradeoff
    [Slides] [Annotated]
  • Readings: Bishop 1.1 to 1.4, Bishop 3.1, 3.1.1, 3.1.4, 3.1.5, 3.2, 3.3, 3.3.1, 3.3.2
 

Wed., Jan 24:
 

Mon., Jan 29:
 
Module 3: Non-linear models
Model selection
(5 Lectures)
  • Decision trees [Applet]
  • Overfitting, again
  • Regularization
  • MDL
  • Cross-validation
  • Boosting [Adaboost Applet]
    www.cse.ucsd.edu/~yfreund/adaboost
  • Instance-based learning [Applet]
    www.site.uottawa.ca/~gcaron/applets.htm
    • K-nearest neighbors
    • Kernels
  • Neural nets [CMU Course]
    www.cs.cmu.edu/afs/cs/academic/class/15782-s04/
Wed., Jan. 31:
 

Mon., Feb 5:

Wed., Feb. 7:
 

Mon., Feb. 12:
  • Lecture: Cross Validation, Simple Model Selection, Regularization, MDL, Neural Nets
    [Slides] [Annotated]
  • Readings: (Bishop 1.3) Model Selection / Cross Validation
  • (Bishop 3.1.4) Regularized least squares
  • (Bishop 5.1) Feed-forward Network Functions
 

Wed., Feb. 14:
  • Lecture: Neural Nets, Instance-based Learning
    [Slides] [Annotated]
  • Readings: (Bishop 5.1) Feed-forward Network Functions
  • (Bishop 5.2) Network Training
  • (Bishop 5.3) Error Backpropagation
 
Module 4: Margin-based approaches
(2 Lectures)
  • SVMs [Applets]
    www.site.uottawa.ca/~gcaron/applets.htm
  • Kernel trick
Mon., Feb 19:
  • Lecture: Instance-based Learning, SVMs
    [Slides] [Annotated]
  • Readings: (Bishop 2.5) Nonparametric Methods

Wed., Feb. 21:
  • Lecture: SVMs
    [Slides] [Annotated]
  • Readings: (Bishop 6.1,6.2) Kernels
  • (Bishop 7.1) Maximum Margin Classifiers
  • Hearst 1998: High Level Presentation
  • Burges 1998: Detailed Tutorial
  • (Optional) Platt 1998: Training SVMs with Sequential Minimal Optimization

Module 5: Learning theory
(3 Lectures)
  • Sample complexity
  • PAC learning [Applets]
    www.site.uottawa.ca/~gcaron/applets.htm
  • Error bounds
  • VC-dimension
  • Margin-based bounds
  • Large-deviation bounds
    • Hoeffding's inequality, Chernoff bound
  • Mistake bounds
  • No Free Lunch theorem
Mon., Feb. 26:
  • Lecture: SVMs - The Kernel Trick
    [Slides] [Annotated]

Wed., Feb. 28:
  • Lecture: SVMs - The Kernel Trick, Learning Theory
    [Slides] [Annotated]

Mon., Mar. 5:
  • Lecture: Learning Theory, Midterm review
    [Slides] [Annotated]
  • Readings: (Mitchell Chapter 7) Computational Learning Theory

Mid-term Exam

All material thus far
Wed., Mar 7:
     

Spring break

     

Mon., Mar. 12:
** No class **

Wed., Mar. 14:
** No class **
Module 6: Structured models
(4 Lectures)
  • HMMs
    • Forwards-Backwards
    • Viterbi
    • Supervised learning
  • Graphical Models
Mon., Mar. 19:
  • Lecture: Bayes nets - Representation
    [Slides] [Annotated]
  • Readings: (Bishop 8.1,8.2) Bayesian Networks

Wed., Mar. 21:
  • Lecture: Bayes nets - Representation (cont.), Inference
    [Slides] [Annotated]

Mon., Mar. 26:
  • Lecture: Bayes nets - Inference (cont.),
    HMMs
    [Slides] [Annotated]
  • Readings: (Bishop 8.4.1,8.4.2) - Inference in Chain/Tree Structures
    Rabiner's Detailed HMMs Tutorial

Wed., Mar. 28:
  • Lecture: HMMs
    Bayes nets - Structure Learning
    [Slides] [Annotated]
  • Additional Reading: Heckerman BN Learning Tutorial
  • Additional Reading: Tree-Augmented Naive Bayes paper

Module 7: Unsupervised and semi-supervised learning
(6 Lectures)
  • K-means
  • Expectation Maximization (EM)
  • Combining labeled and unlabeled data
    • EM
    • reweighting labeled data
    • Co-training
    • unlabeled data and model selection
  • Dimensionality reduction
  • Feature selection
Mon., Apr. 2:
  • Lecture: Bayes nets - Structure Learning
    Clustering - K-means & Gaussian mixture models
    [Slides] [Annotated]
  • Readings: (Bishop 9.1, 9.2) - K-means, Mixtures of Gaussians

Wed., Apr. 4:
  • Lecture: Clustering - K-means & Gaussian mixture models
    [Slides] [Annotated]
  • Readings: Neal and Hinton EM paper

Mon., Apr. 9:
  • Lecture: EM
    Baum-Welch (EM for HMMs)
    [Slides] [Annotated]
  • Readings: (Bishop 9.3, 9.4) - EM

Wed., Apr. 11:
  • Lecture: Baum-Welch (EM for HMMs)
    EM for Bayes Nets
    [Slides] [Annotated]
  • Readings: Ghahramani, "An introduction to HMMs and Bayesian Networks"

Mon., Apr. 16:
  • Lecture: EM for Bayes Nets
    Co-Training for semi-supervised learning
    [Slides] [Annotated]
  • Readings: Blum and Mitchell co-training paper
  • Optional reading: Joachims Transductive SVMs paper

Wed., Apr. 18:
  • Guest lecture by Noah Smith

Mon., Apr. 23:
  • Lecture: Semi-supervised learning in SVMs
    Principal Component Analysis (PCA)
    [Slides] [Annotated]
  • Reading: Shlens' PCA tutorial
  • Optional reading: Wall et al. 2003 - PCA for gene expression data
Module 8: Learning to make decisions
(3 Lectures)
  • Markov decision processes
  • Reinforcement learning
Wed., Apr. 25:
  • Lecture: Principal Component Analysis (PCA) (cont.)
    Markov Decision Processes
    [Slides] [Annotated]
  • Reading: Kaelbling et al. Reinforcement Learning tutorial

Mon., Apr 30:
  • Lecture: Markov Decision Processes
    Reinforcement Learning
    [Slides] [Annotated]
  • Reading: Brafman and Tennenholtz: Rmax paper

Module 9: Advanced topics
(3 Lectures)
  • Text data
  • Hierarchical Bayesian models
  • Tackling very large datasets
  • Active learning
  • Overview of follow-up classes
Wed., May 2:
  • Lecture: Reinforcement Learning
    Big Picture
    [Slides] [Annotated]

Project Poster Session

Fri., May 4:
Newell-Simon Hall Atrium
2:00-5:00pm


Project Paper

Thur., May 10:
Project paper due


Final Exam

All material thus far
Tuesday, May 15th, 1-4 p.m.
Location: Baker Hall, Room A51


Recitation

All recitations are Thursdays, 5:30-6:50, Wean Hall 5409, unless otherwise noted.

Date      Instructor   Topic
Jan. 18   Andy         Review of Probability; Distributions; Bayes Rule
Jan. 24   Brian        Introduction to Matlab (code) (obtain Matlab) -- 5:30-6:50pm, NSH 3305
Jan. 25   Jon          Naive Bayes Classification [Slides]
Feb. 1    Purna        Logistic Regression
Feb. 8    Andy         Boosting
Feb. 15   Brian        Neural Networks
Feb. 22   Jon          Support Vector Machines
Mar. 1    Jon          The Kernel Trick
Mar. 8                 ** NO RECITATION **
Mar. 15*               ** NO RECITATION -- SPRING BREAK **
Mar. 22   Andy         Bayes Nets
Mar. 29   Brian        Hidden Markov Models (Applied to Activity Recognition) [inference notes]
Apr. 5    Purna        Structure Learning, Chow-Liu
Apr. 12   Jon          EM for Gaussian Mixture Models, Spectral Clustering
Apr. 19*               ** NO RECITATION -- University Closed **
Apr. 26   Purna
May 3     Andy         MDPs and Reinforcement Learning (RL Sim Applet)
May 10    All TAs      Final exam review session


Exam Schedule


Additional Resources

Here are some example questions for studying for the midterm/final. Note that these are exams from earlier years and contain some topics that will not appear in this year's final; conversely, some topics will appear this year that do not appear in these examples.


Note to people outside CMU

Feel free to use the slides and materials available online here. Please email the instructors with any corrections or improvements. Additional slides and software are available at the Machine Learning textbook homepage and at Andrew Moore's tutorials page.