10-315, Spring 2020

Introduction to Machine Learning

Overview

Key Information

Lectures: Monday + Wednesday, 12:00 pm - 1:20 pm, Posner A35

Recitations: Friday, 12:00 pm - 12:50 pm, Posner A35; see Recitation

Teaching Assistants: Alex Singh, Annie Hu, George Brown, Haoran Fei, Michelle Ma, Nidhi Jain, Vicky Zeng; see the 315 Staff page

Grading: Midterm 20%, Final 30%, Written/Programming homework 40%, Online homework 5%, Participation 5%. Grades will be collected in Canvas.

There is no required textbook for this course. Any recommended readings will come from sources freely available online.

We will use Piazza for questions and any course announcements.

Students will turn in their homework electronically using Gradescope.

Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots). This course covers the core concepts, theory, algorithms and applications of machine learning. We cover supervised learning topics such as classification (Naive Bayes, Logistic regression, Support Vector Machines, neural networks, k-NN, decision trees, boosting) and regression (linear, nonlinear, kernel, nonparametric), as well as unsupervised learning (density estimation, clustering, PCA, dimensionality reduction).
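To give a concrete flavor of the classification setting described above, here is a minimal 1-nearest-neighbor (1-NN) classifier, one of the methods covered in the course. This is an illustrative sketch on made-up toy data, not course-provided code:

```python
# Minimal 1-nearest-neighbor classifier on a toy 2-D problem.
import math

def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_1nn(train_X, train_y, query):
    """Label the query point with the label of its closest training point."""
    nearest = min(range(len(train_X)),
                  key=lambda i: euclidean(train_X[i], query))
    return train_y[nearest]

# Two toy classes in 2-D (made-up data).
train_X = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.2)]
train_y = ["red", "red", "blue", "blue"]

print(predict_1nn(train_X, train_y, (0.1, 0.0)))  # -> red
print(predict_1nn(train_X, train_y, (1.1, 0.9)))  # -> blue
```

Despite its simplicity, k-NN illustrates the core supervised-learning loop: store labeled examples, then predict labels for new points.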

Learning Objectives

After completing the course, students should be able to:

  • Select and apply an appropriate supervised learning algorithm for classification problems (e.g., naive Bayes, support vector machine, logistic regression, neural networks).
  • Select and apply an appropriate supervised learning algorithm for regression problems (e.g. linear regression, ridge regression, nonparametric kernel regression).
  • Recognize different types of unsupervised learning problems, and select and apply appropriate algorithms (e.g., clustering, linear and nonlinear dimensionality reduction).
  • Work with probability (Bayes rule, conditioning, expectations, independence), linear algebra (vector and matrix operations, eigenvectors, SVD), and calculus (gradients, Jacobians) to derive machine learning methods such as linear regression, naive Bayes, and principal component analysis.
  • Understand machine learning principles such as model selection, overfitting, and underfitting, and techniques such as cross-validation and regularization.
  • Implement machine learning algorithms such as logistic regression via stochastic gradient descent, linear regression, or k-means clustering.
  • Run appropriate supervised and unsupervised learning algorithms on real and synthetic data sets and interpret the results.
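To illustrate the kind of implementation work named in the objectives, here is a sketch of logistic regression trained with stochastic gradient descent. The data and hyperparameters (learning rate, epoch count) are made up for illustration and are not prescriptive:

```python
# Logistic regression via stochastic gradient descent on a toy problem.
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sgd_logistic(X, y, lr=0.5, epochs=500, seed=0):
    """Train weights w and bias b by SGD on the logistic (log) loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):  # shuffle each epoch
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            g = p - y[i]  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, X[i])]
            b -= lr * g
    return w, b

# Linearly separable toy data: label 1 only when both features are 1.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [0, 0, 0, 1]
w, b = sgd_logistic(X, y)
preds = [int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) > 0.5)
         for xi in X]
print(preds)
```

The update `g = p - y[i]` is the per-example gradient of the log loss with respect to the linear score, which is what makes this stochastic gradient descent rather than full-batch gradient descent.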

Levels

This course is designed for SCS undergraduate majors. It covers many of the same topics as other introductory machine learning courses, such as 10-301/10-601 and 10-701. Contact the instructor if you are unsure which machine learning course is appropriate for you.

Prerequisites + Corequisites

The prerequisites for this course are:

  • 15-122: Principles of Imperative Computation
  • 36-225 or 36-218 or 36-217 or 15-359 or 21-325 or 36-219: Probability
  • 15-151 or 21-127 or 21-128: Mathematical Foundations of Computer Science / Concepts of Mathematics.

Notably missing from this prerequisite list is any linear algebra course. Linear algebra is nonetheless central to this machine learning course. Given the lack of a linear algebra prerequisite, we will provide the necessary resources and instruction for linear algebra. That said, if you have never been exposed to matrices and vectors in any context, please contact the instructor to discuss how best to meet your linear algebra needs.

Please see the instructor if you are unsure whether your background is suitable for the course.

Office Hours

Feel free to contact any of the course staff to request office hours by appointment. We'll do our best to accommodate these requests. Also, check the office hours appointment calendar, as we occasionally post explicit appointment slots there.

When appropriate, this course uses the CMU OHQueue tool as a queueing system for office hours.

Schedule (subject to change)

Textbooks:

Bishop, Christopher. Pattern Recognition and Machine Learning, available online, (optional)

Murphy, Kevin P. Machine Learning: A Probabilistic Perspective, available online, (optional)

Goodfellow, Ian, Yoshua Bengio, Aaron Courville. Deep Learning, available online, (optional)

Shawe-Taylor, John, Nello Cristianini. Kernel Methods for Pattern Analysis, available online, (optional)

Dates Topic Reading / Demo Slides / Notes
1/13 Mon Introduction to Classification, Regression, and ML Concepts pptx (inked) pdf (inked)
1/15 Wed Introduction to Classification, Regression, and ML Concepts lec2.ipynb notation.pdf handout.pdf whiteboard.pdf
1/20 Mon No class: MLK Day
1/22 Wed Linear Regression Murphy 7.3.1 whiteboard.pdf
1/27 Mon Probabilistic Linear Regression Bishop 1.2.4-5, 3.1.1-2 MRI.pptx (pdf) whiteboard.pdf
1/29 Wed Logistic Regression Bishop 4.1.3, 4.3.2 lec5.ipynb pptx (inked) pdf (inked)
2/3 Mon Regularization pptx (inked) pdf (inked)
2/5 Wed Regularization Bishop 3.1.4, Murphy 7.5 pptx (inked) pdf (inked)
2/10 Mon Naive Bayes Murphy 3.5 pptx (inked) pdf (inked) handout.pdf (sol, sol_add1)
2/12 Wed Generative Models Murphy 4.2, 8.6 pptx (inked) pdf (inked)
2/17 Mon Neural Networks pptx (inked) pdf (inked)
2/19 Wed Neural Networks Goodfellow, et al, Ch. 6 pptx (inked) pdf (inked)
2/24 Mon Neural Networks Goodfellow, et al, Ch. 9 pptx (inked) pdf (inked)
2/26 Wed Nearest Neighbor Murphy 1.4, Bishop 2.5 pptx (inked) pdf (inked)
3/2 Mon MIDTERM EXAM 5-6:30 pm, Location DH 1212 and GHC 4401
3/4 Wed Decision Trees pptx (inked) pdf (inked)
3/9 Mon No class: Spring Break
3/11 Wed No class: Spring Break
3/16 Mon No class: COVID-19
3/18 Wed Decision Trees Murphy 16.2; Entropy, Cross-Entropy video, A. Géron pptx (inked) pdf (inked)
3/23 Mon Cross-validation, Nonparametric Regression Murphy 1.4.8; Shawe-Taylor, Cristianini 7.3 pptx (inked) pdf (inked)
3/25 Wed SVM Bishop 7.1 pptx (inked) pdf (inked)
3/30 Mon SVM (Kernel SVM) Murphy 14, 14.5 pptx (inked) pdf (inked)
4/1 Wed Dimensionality Reduction (PCA) Bishop 12.1, Murphy 12.2 pptx (inked) pdf (inked)
4/6 Mon Dimensionality Reduction (Kernel PCA, Autoencoders) Bishop 12.3, Murphy 14.4.4 pptx (inked) pdf (inked)
4/8 Wed Recommender Systems pptx (inked) pdf (inked)
4/13 Mon Clustering (Hierarchical, K-means) Murphy 25.5, Bishop 9.1 pptx (inked) pdf (inked)
4/15 Wed Clustering (EM, GMM) Bishop 9.2 pptx (inked) pdf (inked)
4/20 Mon Learning Theory pptx (inked) pdf (inked) whiteboard.pdf
4/22 Wed Learning Theory pptx (inked) pdf (inked) whiteboard.pdf
4/27 Mon Learning Theory pptx (inked) pdf (inked)
4/29 Wed Ensemble Methods pptx pdf
5/11 Mon FINAL EXAM - 5:30pm - 8:30pm

Recitation

Recitation starts the first week of class, on Friday, Jan. 17. Recitation attendance is recommended to help solidify weekly course topics. That said, the recitation materials published below are required content and are in scope for the midterm and final exams.

Recitation will be held Fridays from 12:00-12:50 pm. Recitation will (unfortunately) take place in our lecture hall, Posner A35, rather than in individual recitation sections.

Dates Recitation Handout Code
1/17 Fri Recitation 1 pdf (solution) rec1_code.py
1/24 Fri Recitation 2 pdf (solution) rec2.ipynb
1/31 Fri Recitation 3 pdf (solution)
2/7 Fri Recitation 4 pdf (solution)
2/14 Fri Recitation 5 pdf (solution)
2/21 Fri Recitation 6 pdf (solution)
2/28 Fri Recitation 7
3/6 Fri No recitation
3/13 Fri No recitation
3/20 Fri Recitation 8 pdf, DT Guide (solution)
3/27 Fri Recitation 9 pdf (solution)
4/3 Fri Recitation 10 pdf (solution)
4/10 Fri Recitation 11 pdf (solution)
4/17 Fri Recitation 12 pdf (solution)
4/24 Fri Recitation 13 pdf (solution)
5/1 Fri Recitation (Final review) pdf

Exams

The course includes one midterm exam and a final exam. The midterm will be held 5:00-6:30 pm on Mar. 2 (not in class). The final exam will be held 5:30-8:30 pm on May 11 (see the schedule). Plan any travel around exams, as exams cannot be rescheduled.

Assignments

There will be approximately six homework assignments that will have some combination of written and programming components and approximately six online assignments (subject to change). Written and online components will involve working through algorithms presented in the class, deriving and proving mathematical results, and critically analyzing material presented in class. Programming assignments will involve writing code in Python to implement various algorithms.

For any assignments that aren't released yet, the dates below are tentative and subject to change.

Assignment due dates

Assignment Link (if released) Due Date
HW 1 (online) Gradescope 1/21 Tue, 11:59 pm
HW 2 (written/programming) hw2_blank.pdf, hw2_tex.zip, Programming 2/4 Tue, 11:59 pm
HW 3 (online) Gradescope 2/11 Tue, 11:59 pm
HW 4 (written/programming) hw4_blank.pdf, hw4_tex.zip, Programming 2/24 Mon, 11:59 pm
HW 5 (online) Gradescope 2/27 Thu, 11:59 pm
HW 6 (written/programming) hw6_blank.pdf, hw6_tex.zip, hw6_programming.pdf 3/26 Thu, 11:59 pm
HW 7 (online) Gradescope 3/31 Tue, 11:59 pm
HW 8 (written/programming) hw8_blank.pdf, hw8_tex.zip, Programming 4/9 Thu, 11:59 pm
HW 9 (online) Gradescope 4/16 Thu, 11:59 pm
HW 10 (written/programming) hw10_blank.pdf, hw10_tex.zip, Programming 4/30 Thu, 11:59 pm
HW 11 (online) TBD

Policies

Grading

Grades will be collected and reported in Canvas. Please let us know if you believe there is an error in the grade reported in Canvas.

Final scores will be composed of:

  • 20% Midterm exam
  • 30% Final exam
  • 40% Written/Programming homework
  • 5% Online homework
  • 5% Participation

Participation Grades

Participation will be based on the percentage of in-class polling questions answered:

  • 5% for 80% or greater poll participation
  • 3% for 70%
  • 1% for 60%

Correctness of in-class polling responses will not be taken into account for participation grades.
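Putting the score weights and participation tiers together, here is an illustrative calculation of a final course score. The component scores are made up, and the tier mapping is interpreted as thresholds (e.g., 75% poll participation earns 3 points), which is our reading of the table above:

```python
# Illustrative final-score calculation from the grading weights above.
def participation_points(poll_pct):
    """Map poll-participation percentage to points out of 5 (threshold tiers)."""
    if poll_pct >= 80:
        return 5
    if poll_pct >= 70:
        return 3
    if poll_pct >= 60:
        return 1
    return 0

def final_score(midterm, final, written, online, poll_pct):
    """Each component score is on a 0-100 scale; result is out of 100."""
    return (0.20 * midterm + 0.30 * final + 0.40 * written
            + 0.05 * online + participation_points(poll_pct))

# Hypothetical student: 85 midterm, 90 final, 92 written/programming,
# 100 online, 95% poll participation.
print(final_score(85, 90, 92, 100, 95))  # -> 90.8 (approximately)
```

Note that the 5% participation weight is realized by awarding up to 5 points directly, which is equivalent to weighting a 0-100 participation score by 0.05.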

It is against the course academic integrity policy to answer in-class polls when you are not present in lecture. Violations of this policy will be reported as an academic integrity violation. Information about academic integrity at CMU may be found at https://www.cmu.edu/academic-integrity.

Final Grade

This class is not curved. However, final course scores are converted to letter grades based on grade boundaries determined at the end of the semester. What follows is a rough guide to how course grades will be established, not a precise formula; we will fine-tune cutoffs and other details as we see fit after the end of the course. It is meant to help you set expectations and take action if your trajectory in the class is not taking you to the grade you are hoping for. As a very rough heuristic, final grades correlate with total scores as follows:

  • A: above 90%
  • B: 80-90%
  • C: 70-80%
  • D: 60-70%

This heuristic assumes that the makeup of a student’s grade is not wildly anomalous: exceptionally low overall scores on exams, programming assignments, or written assignments will be treated on a case-by-case basis.

Precise grade cutoffs will not be discussed at any point during or after the semester. For students very close to grade boundaries, instructors may, at their discretion, consider participation in lecture and recitation, exam performance, and overall grade trends when assigning the final grade.

Late Policy

Written/programming homework and online homework:

  • 6 slip days across all assignment types
  • Use up to two per assignment
  • You may use these at your discretion, but they are intended for minor illness and other disruptive events outside of your control, not for poor time management
  • You are responsible for keeping track of your own slip days; Gradescope will not enforce the total number of slip days
  • Homework submitted after these two slip days, or submitted by a student with no slip days remaining, will receive a score of 0.

Aside from this, there will be no extensions on assignments in general. If you think you genuinely need an extension on a particular assignment, contact the instructor as soon as possible and before the deadline. Please be aware that extensions are entirely discretionary and will be granted only in exceptional circumstances outside of your control (e.g., severe illness or a major personal/family emergency, but not competitions, club events, or interviews). The instructors will require confirmation from University Health Services or your academic advisor, as appropriate.
Nearly all situations that make you run late on an assignment can be avoided with proper planning -- often just starting early. Here are some examples:

  • I have so many deadlines this week: you know your deadlines ahead of time — plan accordingly.
  • It's a minute before the deadline and the network is down: you are allowed multiple submissions -- don't wait until the deadline to make your first one.
  • My computer crashed and I lost everything: Use Dropbox or similar to do real-time backup -- recover your files onto AFS and finish your homework from a cluster machine.
  • My fraternity/sorority/club has that big event that is taking all my time: Schedule your extra-curricular activities around your classes, not vice versa.

Collaboration Policy

We encourage you to discuss course content and assignments with your classmates. However, these discussions must be kept at a conceptual level only.

  • You may NOT view, share, or communicate about any artifact that will be submitted as part of an assignment. Example artifacts include, but are not limited to: code, pseudocode, diagrams, and text.
  • You may look at another student's code output and discuss it at a conceptual level, as long as it is not output that appears directly in the homework submission.
  • You may look at another student's code error messages and discuss what the error means at a conceptual level. However, you may NOT give specific instructions to fix the error.
  • All work that you present must be your own.
  • Using any external sources of code or algorithms in any way requires the instructor's approval before you submit the work. For example, you must get instructor approval before using an algorithm you found online to implement a heuristic function in a programming assignment.

Violations of these policies will be reported as an academic integrity violation. Information about academic integrity at CMU may be found at https://www.cmu.edu/academic-integrity. Please contact the instructor if you ever have any questions regarding academic integrity or these collaboration policies.

Accommodations for Students with Disabilities

If you have a disability and have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with us as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, we encourage you to visit their website.

Statement of Support for Students’ Health & Well-being

Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, getting enough sleep, and taking some time to relax. This will help you achieve your goals and cope with stress.

All of us benefit from support during times of struggle. There are many helpful resources available on campus, and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is almost always helpful.

If you or anyone you know experiences academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 or visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend, faculty member, or family member you trust for help getting connected to support.

If you have questions about this or your coursework, please let us know. Thank you, and have a great semester.