William Cohen and Tom Mitchell
Machine Learning Department
School of Computer Science, Carnegie Mellon University
Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots). This course covers the theory and practical algorithms for machine learning from a variety of perspectives. We cover topics such as data mining, Bayesian networks, decision tree learning, neural network learning, statistical learning methods, and reinforcement learning. The course covers theoretical concepts such as inductive bias, the PAC learning framework, Bayesian learning methods, margin-based learning, and Occam's Razor. Short programming assignments include hands-on experiments with various learning algorithms. Typical assignments include learning to automatically classify email by topic, and learning to automatically classify the mental state of a person from brain image data. Students entering the class with a pre-existing working knowledge of probability, statistics, and algorithms will be at an advantage, but the class has been designed so that anyone with a strong numerate background can catch up and fully participate. This class is intended for Masters students and advanced undergraduates.
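For a concrete sense of what the hands-on assignments look like, here is a minimal sketch (not taken from the actual course materials) of the email-topic classification task described above, using a naive Bayes classifier from scikit-learn. The toy messages, labels, and variable names are invented purely for illustration.

# Minimal sketch of text classification with naive Bayes (illustrative only;
# the tiny dataset below is made up and not part of any course assignment).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy "emails" labeled by topic.
train_texts = [
    "meeting agenda for the project review",
    "schedule the budget meeting for monday",
    "cheap tickets and free prize inside",
    "win a free vacation prize now",
]
train_labels = ["work", "work", "spam", "spam"]

# Represent each message as a bag-of-words count vector.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Fit a multinomial naive Bayes classifier (the model covered in the
# Naive Bayes lectures below).
clf = MultinomialNB()
clf.fit(X_train, train_labels)

# Classify a new message.
X_test = vectorizer.transform(["free prize meeting"])
print(clf.predict(X_test))  # expected on this toy data: ['spam']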
IF YOU ARE ON THE WAIT LIST: come to class anyway the first week. Perhaps others will drop, providing you an opening. This course is now offered every semester.
Note from the instructors: We want you to enjoy the course, work hard, learn a lot, and become as enthusiastic about this material as we are. If you have suggestions about how to improve any aspect of the course during the semester, please let us know!
Class lectures: Mondays & Wednesdays, 3:00pm-4:20pm, Wean Hall 5309
Review sessions: Thursdays 5-6pm, Newell-Simon Hall 1305, starting Thursday, January 17. TAs will cover material from recent lectures and current homeworks, and will answer your questions. These review sessions are optional (but very helpful!).
Instructors: William Cohen and Tom Mitchell
Course administrative assistant: Sharon Cavlovich
Textbooks:
Homeworks will be done individually: each student must hand in their own answers. It is acceptable, however, for students to collaborate in figuring out answers and helping each other solve the problems. If you do collaborate in this way on a homework, you must indicate on your homework with whom you collaborated. We will be assuming that, as participants in an upper-level course, you will be taking the responsibility to make sure you personally understand the solution to any work arising from such collaboration.
If you feel that we have made an error in grading your homework, please turn in your homework with a written explanation to Sharon Cavlovich, and we will consider it. Please note that regrading of a homework may cause your grade to go up or down. All regrading requests must be made within one week of when you receive your graded homework.
Date | Lecture topic and readings | Lecturer | Homeworks
Mon Jan 14 | Introduction to machine learning; decision tree learning | Mitchell |
Wed Jan 16 | Decision tree learning: pruning, overfitting, Occam's razor | Mitchell | HW1 out [Data for Q2]
Mon Jan 21 | No class (Martin Luther King Day) | |
Wed Jan 23 | Fast tour of useful concepts in probability | Cohen |
Mon Jan 28 | Naive Bayes I: conditional independence, Bayes rule, Bayesian classifiers | Mitchell | HW1 due; HW2 out [Data]
Wed Jan 30 | Naive Bayes II. Examples: classifying text, classifying mental states from brain images | Mitchell |
Mon Feb 4 | Perceptrons and linear classifiers | Cohen |
Wed Feb 6 | Logistic regression | Mitchell |
Mon Feb 11 | Logistic regression: generative and discriminative classifiers, maximizing conditional data likelihood, MLE and MAP estimates | Mitchell | HW2 due; HW3 out [Data] [Data] [Readme]
Wed Feb 13 | Evaluation: statistical estimation, statistical testing, cross-validation estimates of accuracy | Cohen |
Mon Feb 18 | PAC learning | Mitchell |
Wed Feb 20 | Bayes nets I: representation and inference | Cohen | HW3 due; HW4 out
Mon Feb 25 | Bayes nets II: inference and learning from fully observed data | Cohen | HW4 due; HW5 out
Wed Feb 27 | Bayes nets III: learning from partly unobserved data, EM | Cohen |
Mon Mar 3 | Bayes nets IV: mixtures of Gaussians; midterm review | Mitchell | HW5 due
Wed Mar 5 | ** MIDTERM EXAM ** In class; open book, open notes, no internet connectivity. [Midterm solutions] | |
Mar 10, 12 | Spring break! | |
Mon Mar 17 | Hidden Markov models I | Mitchell | HW6 (project proposal) out
Wed Mar 19 | Hidden Markov models II | Mitchell |
Mon Mar 24 | Collaborative filtering | Cohen | HW6 (project proposal) due; HW7 (project progress report) out; HW8 out
Wed Mar 26 | Support vector machines and the "kernel trick" | Cohen |
Mon Mar 31 | Semi-supervised learning I | Mitchell |
Wed Apr 2 | Semi-supervised learning II | Mitchell |
Mon Apr 7 | Dimensionality reduction: feature selection, PCA, etc. | Mitchell |
Wed Apr 9 | Artificial neural networks and supervised dimensionality reduction | Mitchell |
Mon Apr 14 | Regression and the bias-variance tradeoff | Cohen | HW7 (project progress report) due; HW9 out
Wed Apr 16 | Nearest neighbor methods | Cohen |
Mon Apr 21 | Reinforcement learning | Mitchell | HW9 due
Wed Apr 23 | Human and machine learning | Mitchell |
Mon Apr 28 | Markov logic networks, inductive logic programming | Cohen |
Wed Apr 30 | Project poster session: NSH 3305 | students! | Project posters due today; project writeups due 9:00am Mon May 5
Tue May 13 | Final exam | | The exam will be 2.5 hours long, starting at 9:00am
Course Website (this page):
Note to people outside CMU: Please feel free to reuse any of these course materials that you find of use in your own courses. We ask that you retain any copyright notices, and include written notice indicating the source of any materials you use.