Machine Learning, 10-601 

William Cohen and Tom Mitchell
Machine Learning Department
School of Computer Science, Carnegie Mellon University

Spring 2008


Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots). This course covers the theory and practical algorithms for machine learning from a variety of perspectives. We cover topics such as data mining, Bayesian networks, decision tree learning, neural network learning, statistical learning methods, and reinforcement learning. The course covers theoretical concepts such as inductive bias, the PAC learning framework, Bayesian learning methods, margin-based learning, and Occam's Razor. Short programming assignments include hands-on experiments with various learning algorithms. Typical assignments include learning to automatically classify email by topic, and learning to automatically classify the mental state of a person from brain image data. Students entering the class with a pre-existing working knowledge of probability, statistics, and algorithms will be at an advantage, but the class has been designed so that anyone with a strong quantitative background can catch up and fully participate. This class is intended for Masters students and advanced undergraduates.

IF YOU ARE ON THE WAIT LIST:  come to class anyway during the first week.  Others may drop, opening a spot for you.  This course is now offered every semester.

Note from the instructors: We want you to enjoy the course, work hard, learn a lot, and become as enthusiastic about this material as we are.  If you have suggestions about how to improve any aspect of the course during the semester, please let us know!

Class lectures: Mondays & Wednesdays 3:00pm-4:20pm, Wean Hall 5309 

Review sessions: Thursdays 5-6pm, Location: Newell-Simon Hall 1305, starting on Thursday January 17.  TAs will cover material from recent lectures and current homeworks, and will answer your questions.  These review sessions are optional (but very helpful!).


Teaching Assistants:

Course administrative assistant:


Announcement emails:
Newsgroup - Bulletin Board: academic.cs.10601

Final grades:

Homework policy:
1. Collaborating on homeworks:
2. To cover the possibility that a dog may eat your homework, we will automatically drop your lowest homework grade when calculating your final grade.  

3. Late homeworks:
4. Regrading homeworks:
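The drop-lowest rule in item 2 above can be sketched in a few lines. This is only an illustration, assuming all homeworks carry equal weight (the actual weighting scheme is not specified on this page):

```python
def homework_average(scores):
    """Average homework scores after dropping the single lowest one.

    Assumes equal weighting across homeworks (an assumption, not stated
    on this page). With only one score, nothing is dropped.
    """
    if len(scores) > 1:
        scores = sorted(scores)[1:]  # discard the lowest score
    return sum(scores) / len(scores)


# e.g., homework_average([80, 90, 100]) averages only 90 and 100
```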
Course projects: see the course project page.  Proposals are due March 25.


Tentative lecture schedule:


Date | Lecture topic and readings | Lecturer | Homeworks
Mon Jan 14 | Introduction to Machine Learning; decision tree learning | |
Wed Jan 16 | Decision tree learning: pruning, overfitting, Occam's razor | Mitchell | HW1 out [Data for Q2]
Mon Jan 21 | No class -- Martin Luther King Day | |
Wed Jan 23 | Fast tour of useful concepts in probability | Cohen |
Mon Jan 28 | Naive Bayes I: conditional independence, Bayes rule, Bayesian classifiers | Mitchell | HW1 due; HW2 out
Wed Jan 30 | Naive Bayes II: examples: classifying text, classifying mental states from brain images | |
Mon Feb 4 | Perceptrons and linear classifiers | |
Wed Feb 6 | Logistic regression | |
Mon Feb 11 | Logistic regression: generative and discriminative classifiers, maximizing conditional data likelihood, MLE and MAP estimates | Mitchell | HW2 due; HW3 out [Data, Readme]
Wed Feb 13 | Evaluation: statistical estimation, statistical testing, cross-validation estimates of accuracy (lecture slides: Evaluation; required reading: Machine Learning, Chapter 5) | |
Mon Feb 18 | PAC learning | |
Wed Feb 20 | Bayes nets I: representation and inference | Cohen | HW3 due; HW4 out
Mon Feb 25 | Bayes nets II: inference and learning from fully observed data | Cohen | HW4 due; HW5 out
Wed Feb 27 | Bayes nets III: learning from partly unobserved data, EM | Cohen |
Mon Mar 3 | Bayes nets IV: mixtures of Gaussians; midterm review | Mitchell | HW5 due
Wed Mar 5 | ** MIDTERM EXAM ** In class; open book, open notes, no internet connectivity [Midterm solutions] | |
Mar 10, 12 | Spring break! | |
Mon Mar 17 | Hidden Markov Models I (slides: see March 19) | Mitchell | HW6 (project proposal) out
Wed Mar 19 | Hidden Markov Models II | |
Mon Mar 24 | Collaborative filtering | Cohen | HW6 (project proposal) due; HW7 (project progress report) out; HW8 out
Wed Mar 26 | Support vector machines and the "kernel trick" | Cohen |
Mon Mar 31 | Semi-supervised learning I | |
Wed Apr 2 | Semi-supervised learning II | |
Mon Apr 7 | Dimensionality reduction, feature selection, PCA, etc. | |
Wed Apr 9 | Artificial neural networks and supervised dimensionality reduction | |
Mon Apr 14 | Regression and the bias-variance tradeoff (slides; readings: Bishop 3.1, 3.2) | Cohen | HW7 (project progress report) due; HW9 out
Wed Apr 16 | Nearest neighbor methods | Cohen |
Mon Apr 21 | Reinforcement learning | Mitchell | HW9 due
Wed Apr 23 | Human and machine learning | |
Mon Apr 28 | Markov logic networks, inductive logic programming | Cohen |
Wed Apr 30 | Project poster session: NSH 3305 (the room will be open at 2:30 so you can set up; the session will start at 3:00, the usual start time, and end by 5:00; recitation May 1 in a different location: NSH 3002) | | Project posters due today
 | | | Project writeups due May 2
9:00 am Mon May 5 | | |
Tue May 13 | Final Exam: the exam will be 2.5 hours long, starting at 9:00am | |

Web pages for earlier versions of this course (these include examples of midterms, homework questions, ...):

Course Website (this page):

Note to people outside CMU:  Please feel free to reuse any of these course materials that you find of use in your own courses.  We ask that you retain any copyright notices, and include written notice indicating the source of any materials you use.