Introduction to Machine Learning (PhD)
Spring 2019, CMU 10701


Lectures: MW, 10:30-11:50am, Rashid Auditorium: 4401 Gates and Hillman Center (GHC)
Recitations: F, 10:30-11:50am, Rashid Auditorium: 4401 Gates and Hillman Center (GHC)

Instructors:

Assistant Instructor: Brynn Edmunds


Teaching Assistants:
Byungsoo Jeon
Fabricio Flores
Gi Bum Kim
Jinke Liu
Mauro Moretto
Yimeng Zhang
Ziheng Cai


Communication: Piazza will be used for discussion about the course and assignments.

Office Hours:
  • Leila: Tuesday 2-3pm, 8217 GHC
  • Byungsoo Jeon: Thursday 9-10am, GHC 6th floor collaborative space
  • Gi Bum Kim: Monday 6-7pm, GHC 6th floor collaborative space
  • Jinke Liu: Friday 9-10am, GHC 6th floor collaborative space
  • Mauro Moretto: Tuesday 9-10am, GHC 6th floor collaborative space
  • Yimeng Zhang: Tuesday 4-5pm, GHC 6th floor collaborative space
  • Ziheng Cai: Friday 4-5pm, GHC 6th floor collaborative space

Course Description

Machine learning studies the question "How can we build computer programs that automatically improve their performance through experience?" This includes learning to perform many types of tasks based on many types of experience. For example, it includes robots learning to better navigate based on experience gained by roaming their environments, medical decision aids that learn to predict which therapies work best for which diseases based on data mining of historical health records, and speech recognition systems that learn to better understand your speech based on experience listening to you.

This course is designed to give PhD students a thorough grounding in the methods, mathematics and algorithms needed to do research and applications in machine learning. Students entering the class with a pre-existing working knowledge of probability, statistics and algorithms will be at an advantage, but the class has been designed so that anyone with a strong numerate background can catch up and fully participate.

If you are interested in this topic, but are not a PhD student, or are a PhD student not specializing in machine learning, you might consider the master's level course on Machine Learning, 10-601. 10-601 may be appropriate for MS and undergrad students who are interested in the theory and algorithms behind ML. You can evaluate your ability to take 10-701 via a self-assessment exam here and see an ML course comparison here.

Prerequisites

Students entering the class are expected to have a pre-existing working knowledge of probability, linear algebra, statistics and algorithms, though the class has been designed to allow students with a strong numerate background to catch up and fully participate. In addition, recitation sessions will be held to review basic concepts.

Resources

Supplementary readings:
  1. [KM] Machine Learning: A probabilistic perspective, Kevin Murphy.
  2. [CB] Pattern Recognition and Machine Learning, Christopher Bishop.
  3. [HTF] The Elements of Statistical Learning: Data Mining, Inference and Prediction, Trevor Hastie, Robert Tibshirani, Jerome Friedman.
  4. [TM] Machine Learning, Tom Mitchell.

Schedule

The schedule is tentative and might change according to class progress and interest. Every Friday class is intended to be a recitation to review material or answer homework questions; however, this might change if we need a makeup lecture.

Each entry below lists the date, any notes (e.g., homework deadlines), the lecture topic, and suggested resources.

Basics

01/14 Lecture 1: Introduction - What is Machine Learning - slides, notes     [CB] Chapter 2.1, Appendix B
    [KM] Chap 1
01/16 HW1 Out Lecture 2: Building blocks - MLE, MAP, Probability review - notes     [KM] Chap 2
    [HTF] Chap 8
01/18 Special Office Hours
01/21 MLK day, no class
01/23 Lecture 3: Classification, Bayes Decision Rule, kNN - notes [KM] Chap 1, [CB] 1.5, Hal Daume III's Book Chapter 2 and Chapter 3
Manual Construction of Voronoi Diagram
KNN Applet
01/25 Recitation

Parametric Estimation and Prediction

01/28 Lecture 4: Linear Regression, Regularization - notes [HTF] Chapter 3
[KM] 7.1 to 7.5
[CB] 3.1, 3.2
Hal Daume III's Book Chapter 2
01/30 HW1 due, HW2 Out Lecture 5: Canceled due to weather
02/01 Recitation
02/04 Lecture 5: Naive Bayes, Logistic Regression, Discriminative vs. Generative - slides Tom Mitchell's Generative and Discriminative Classifiers Chapter
[TM] 6.1 to 6.10
[CB] 4.2, 4.3
02/06 Lecture 6: Naive Bayes, Logistic Regression, Discriminative vs. Generative - slides Tom Mitchell's Generative and Discriminative Classifiers Chapter
[TM] 6.1 to 6.10
[CB] 4.2, 4.3
[KM] 1.4.6
Ng and Jordan 2002
02/08 Recitation
02/11 Project Topic Selection Lecture 7: Decision Trees - slides [TM] Chapter 3
[CB] 1.6, 14.4
02/13 HW2 due, HW3 Out Lecture 8: Neural Networks (perceptron, neural nets) - slides [TM] Chapter 3
[CB] 1.6, 14.4
Hal Daume III's Book Chapter 4
02/15 Special Office Hours - Rashid Auditorium
02/18 Lecture 9: Neural Networks (deep nets, backprop) - slides [TM] Ch. 4
[CB] Ch. 5
Hal Daume III's Book Chapter 10
02/20 Lecture 10: SVMs - slides Andrew Ng's lecture notes
02/22 Recitation
02/25 Course Drop Deadline Lecture 11: SVMs - slides Andrew Ng's lecture notes
02/27 HW3 due Lecture 12: SVMs - Boosting - slides Andrew Ng's lecture notes
Rob Schapire's 2001 paper on Boosting
03/01 Recitation

Learning Theory

03/04 Lecture 13: Learning Theory - slides [TM] Chapter 7
Nina Balcan’s notes on generalization guarantees
03/06 Midway Report Due Lecture 14: Learning Theory - slides [TM] Chapter 7
Nina Balcan’s notes on generalization guarantees
03/08 Mid-Semester Break, no class
03/11 Spring break, no class
03/13 Spring break, no class
03/15 Spring break, no class

Unsupervised Learning

03/18 Guest Lecturer Matt Gormley
Midterm Review - slides
03/20 Guest Lecturer Matt Gormley
Lecture 17: K-means slides
[HTF] Ch. 14.1-14.3
Hal Daume III's Book Chapter 15
03/21 Midterm Exam (Thursday 6:30pm)

03/22 Recitation
03/25 HW4 Out Guest Lecturer Ameet Talwalkar
Lecture 18: EM and Gaussian Mixture Models slides
[CB] Ch. 9
Hal Daume III's Book Chapter 16

Graphical Models and Structured Prediction

03/27 Guest Lecturer Tom Mitchell
Lecture 19: Graphical Models slides
[CB] Ch. 8-8.2
03/29 Recitation
04/01 Guest Lecturer Tom Mitchell
Lecture 20: Graphical Models slides
[CB] Ch. 8
04/03 Lecturer Aaditya Ramdas
Lecture 21: HMMs - notes from Fall 2017 11-711
04/05 Recitation

Unsupervised Learning (continued)

04/08 HW4 Due Lecturer Aaditya Ramdas
Lecture 22: SVD and PCA - slides - Andrew Ng's Notes
Cleve Moler's chapter on eigenvalues
Aaditya Ramdas's SVD review videos
04/10 Lecturer Aaditya Ramdas
Lecture 23: SVD and PCA - slides - Andrew Ng's Notes
Cleve Moler's chapter on eigenvalues
Aaditya Ramdas's SVD review videos
04/12 Spring Carnival, No Class

Special Topics

04/15 Lecturer Aaditya Ramdas
Lecture 24: Reinforcement Learning - Andrew Ng's Notes
04/17 Projects Poster Session (Wednesday 10am-11:30am and 12pm-1:30pm)

04/19 Recitation
04/22 Lecturer Aaditya Ramdas
Lecture 25: Reinforcement learning - Andrew Ng's Notes
04/24 Lecturer Aaditya Ramdas
Lecture 26: Comparing Classifiers
Jose Lozano's slides
Janez Demsar's paper
Katarzyna Stąpor's chapter
04/26 Recitation
04/29 Lecturer Aaditya Ramdas
Lecture 27: Sources of bias in applied ML
Jose Lozano's slides
Janez Demsar's paper
Katarzyna Stąpor's chapter
05/01 Final Reports Due Lecture 28: Review and discussion - slides
05/03 Last day of class Recitation

Grading

The final grade will be determined as follows:
  • Exam 1: 15%
  • Exam 2: 15%
  • Homework: 40%
  • Project: 25%
  • Participation: 5%
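
For concreteness, the weighted combination works as in the minimal Python sketch below; the component names mirror the table above, and the example scores are hypothetical, not course data.

    # Illustrative sketch only (not official course code): how the grading
    # weights above combine component scores (each on a 0-100 scale).
    WEIGHTS = {
        "Exam 1": 0.15,
        "Exam 2": 0.15,
        "Homework": 0.40,
        "Project": 0.25,
        "Participation": 0.05,
    }

    def final_grade(scores):
        """Weighted average of component scores (each 0-100)."""
        return sum(weight * scores[name] for name, weight in WEIGHTS.items())

    # Hypothetical example scores (made up for illustration):
    example = {"Exam 1": 85, "Exam 2": 90, "Homework": 92,
               "Project": 88, "Participation": 100}
    print(final_grade(example))  # 12.75 + 13.5 + 36.8 + 22.0 + 5.0 = 90.05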

Course Policies

(The following policies are adapted from Ziv Bar-Joseph and Pradeep Ravikumar's 10-701 Fall 2018 and Roni Rosenfeld's 10-601 Spring 2016 course policies.)

Homework

There will be four homework assignments, each worth 10% of the final grade. Written answers will be submitted on Gradescope (through Canvas) and code portions through Autolab.

Projects

You may work in teams of 3-5 people. There will be a limited number of projects to choose from; you will not be able to choose other projects. There will be two deliverables (project proposal and final report). Each team member's contribution should be highlighted. You should use the project as an opportunity to "learn by doing".

Extensions

You will have 8 late days (for the entire semester) that you can use for homework submissions. You can choose how to divide the days across the homework assignments, with the constraint that you cannot use more than 4 days for any given homework. After you exceed your allowed days, you will lose all of the points for a late homework. You should use these days if you need extensions for various deadlines (conferences, interviews, etc.). If you need an extension for some other reason, you should talk to us at least five days in advance. In the case of a true emergency that cannot be predicted (sickness, family problems, etc.), we can give a reasonable extension. You cannot use late days for the projects.
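
As an illustration only, the late-day constraints above (8 days total for the semester, at most 4 on any single homework) can be written as a small check; the function below is a hypothetical sketch in Python, not an official tool.

    # Hypothetical sketch of the late-day policy described above.
    TOTAL_LATE_DAYS = 8      # budget for the whole semester
    MAX_PER_HOMEWORK = 4     # cap on any single homework

    def late_days_ok(days_per_hw):
        """Return True if a plan for spending late days respects both limits."""
        return (sum(days_per_hw) <= TOTAL_LATE_DAYS
                and all(d <= MAX_PER_HOMEWORK for d in days_per_hw))

    print(late_days_ok([2, 4, 1, 1]))  # True: 8 days total, none above 4
    print(late_days_ok([5, 1, 0, 0]))  # False: 5 days on one homework exceeds the cap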

Academic Integrity

Collaboration among Students

Collaboration among students is allowed, but it is intended to help you learn better. You can work on solving assignments together, but you should always write up your solutions separately. You should also always implement code on your own. Whenever collaboration happens, all parties involved should report it on the relevant homework problem.

Online Resources

Some of the homework questions you receive might have solutions online. Looking up the answers to these homework questions is not allowed. Similarly, looking up code for a problem is not allowed. Sometimes you might need help with a small portion of the code (for example, array indexing); looking up such basic things, which are unrelated to understanding the material, is allowed. Do use office hours if you have questions about homework problems.

Disclosing

Whenever you collaborate with someone or look at online material, you should disclose it in your homework. When in doubt, always disclose. Any breach of academic integrity policies will be reported to the university authorities (your Department Head, Associate Dean, Dean of Student Affairs, etc.) as an official Academic Integrity Violation and will carry severe penalties. Cheating is both unethical and a bad strategy.

Accommodations for Students with Disabilities

If you have a disability and are registered with the Office of Disability Resources, I encourage you to use their online system to notify me of your accommodations and discuss your needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at access@andrew.cmu.edu.

Take care of yourself

Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress.

All of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is almost always helpful.

If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.

If you or someone you know is feeling suicidal or in danger of self-harm, call someone immediately, day or night:
  • CaPS: 412-268-2922
  • Re:solve Crisis Network: 888-796-8226

If the situation is life-threatening, call the police:
  • On campus: CMU Police: 412-268-2323
  • Off campus: 911