10701 Introduction to Machine Learning

Course description

Machine learning studies the question "how can we build computer programs that automatically improve their performance through experience?" This includes learning to perform many types of tasks based on many types of experience. For example, it includes robots learning to better navigate based on experience gained by roaming their environments, medical decision aids that learn to predict which therapies work best for which diseases based on data mining of historical health records, and speech recognition systems that learn to better understand your speech based on experience listening to you.

This course is designed to give PhD students a thorough grounding in the methods, theory, mathematics and algorithms needed to do research and applications in machine learning. The topics of the course draw from machine learning, classical statistics, data mining, Bayesian statistics and information theory. Students entering the class with a pre-existing working knowledge of probability, statistics and algorithms will be at an advantage, but the class has been designed so that anyone with a strong numerate background can catch up and fully participate.

Textbooks

Machine Learning: a Probabilistic Perspective, Kevin Murphy
Pattern Recognition and Machine Learning, Chris Bishop, ISBN
Machine Learning, Tom Mitchell.
Information Theory, Inference, and Learning Algorithms, David MacKay. Note: This book is available online as a free PDF.
Additional readings will be made available as appropriate.

Grading

The requirements of this course consist of participation in lectures, a midterm, 5 problem sets and a project. This is a PhD-level class, and the most important thing for us is that by the end of this class students understand the basic methodologies in machine learning and are able to use them to solve real problems of modest complexity. The grading breakdown is as follows:

Problem sets (5 assignments, 40%)
Midterm (25%)
Final project (35%)
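
As an illustration of how the breakdown above combines into a single number, here is a minimal Python sketch. It assumes every component is scored on a 0-100 scale and that the five problem sets count equally within their 40% share; the syllabus does not state either point, and the function name course_score is hypothetical.

```python
# Weights taken from the grading breakdown above.
WEIGHTS = {"problem_sets": 0.40, "midterm": 0.25, "project": 0.35}


def course_score(problem_set_scores, midterm, project):
    """Return a weighted course score on a 0-100 scale.

    Assumption (not stated in the syllabus): all problem sets
    carry equal weight within the 40% problem-set share.
    """
    ps_average = sum(problem_set_scores) / len(problem_set_scores)
    return (
        WEIGHTS["problem_sets"] * ps_average
        + WEIGHTS["midterm"] * midterm
        + WEIGHTS["project"] * project
    )


# Problem-set average 80, midterm 80, project 90.
print(course_score([80, 90, 100, 70, 60], midterm=80, project=90))
```

Under these assumptions the example works out to 0.40 × 80 + 0.25 × 80 + 0.35 × 90 = 83.5, up to floating-point rounding.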

Homework resources and collaboration policy

Homeworks and exams may contain material that has been covered by papers and webpages. Since this is a graduate class, we expect students to want to learn and not google for answers. You should cite the materials you used.

Homeworks will be done individually: each student must hand in their own answers. It is acceptable, however, for students to collaborate in figuring out answers and helping each other solve the problems. We will be assuming that, as participants in a graduate course, you will be taking the responsibility to make sure you personally understand the solution to any work arising from such collaboration. You also must indicate on each homework with whom you collaborated.

The final project may be completed by small teams.

Late homework policy

You will be allowed 2 total late days without penalty for the entire semester. You may be late by 1 day on two different homeworks or late by 2 days on one homework. Weekends and holidays are also counted as late days. Late submissions are automatically considered as using late days. Once those days are used, you will be penalized according to the following policy:

Homework is worth full credit before the deadline.
It is worth half credit for the next 24 hours.
It is worth zero credit after that.

You must turn in at least n-1 of the n homeworks, even if for zero credit, in order to pass the course.
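
The late policy above can be sketched in Python as follows. This is one possible reading, not the official grading procedure: it assumes lateness is rounded up to whole days, that remaining free late days are applied automatically before any penalty, and that the half-credit window starts once the free days run out. The function name credit_multiplier is hypothetical.

```python
import math

FREE_LATE_DAYS = 2  # total free late days for the semester, per the policy


def credit_multiplier(hours_late, late_days_left):
    """Return (credit multiplier, late days left after this submission).

    Assumptions (not stated in the syllabus): any fraction of a day
    counts as a full late day, and free late days are consumed first.
    """
    if hours_late <= 0:
        return 1.0, late_days_left  # on time: full credit

    days_late = math.ceil(hours_late / 24)
    days_covered = min(days_late, late_days_left)
    penalized_days = days_late - days_covered

    if penalized_days == 0:
        multiplier = 1.0  # fully covered by free late days
    elif penalized_days == 1:
        multiplier = 0.5  # within 24 hours of the effective deadline
    else:
        multiplier = 0.0  # more than 24 hours past it
    return multiplier, late_days_left - days_covered


# Ten hours late with no free late days remaining: half credit.
print(credit_multiplier(10, late_days_left=0))
```

For example, a student who submits 30 hours late with both free days available would, under this reading, use both days and still receive full credit.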

Note to people outside CMU

Please feel free to reuse any of these course materials that you find of use in your own courses. We ask that you retain any copyright notices, and include written notice indicating the source of any materials you use.