10-301/601: Introduction to Machine Learning

Summer 2022

Narwhal Net

Key Information and Links

Instructor: Henry Chai
Education Associate: Brynn Edmunds
Lectures: Mondays, Tuesdays, and Wednesdays from 2 PM to 3:20 PM (EDT) in PH 100. Lectures will be livestreamed via Zoom and recorded; the recordings will be hosted by Panopto.
In-class Polls: Some portion of your grade will be determined based on participation in polls during lecture. The latest poll can always be found at; in order to respond, you must be logged in with your CMU email.
Recitations: Thursdays from 2 PM to 3:20 PM (EDT) in PH 100. Attendance at recitations is optional; therefore, outside of extraordinary circumstances, recitations will not be livestreamed or recorded. Recitation handouts can be found under the Recitations tab.
Announcements/Q&A: We will be using Ed for making announcements and answering questions.
Homeworks: Homework handouts will be posted to the course website under the Assignments tab. Homeworks should be submitted via Gradescope.
Office Hours: The time and location of office hours can be found on the course calendar.


1. Course Description

Machine Learning is concerned with computer programs that automatically improve their performance through experience, e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots. This course covers the theory and practical algorithms for machine learning from a variety of perspectives. Specific topics include Bayesian networks, decision tree learning, support vector machines, statistical learning methods, unsupervised learning and reinforcement learning as well as theoretical concepts such as inductive bias, the PAC learning framework, Bayesian learning methods, margin-based learning, and Occam’s Razor. Programming assignments include hands-on experiments with various learning algorithms. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics and algorithms currently needed by people who do research in machine learning.

10-301 and 10-601 are identical. Undergraduates must register for 10-301 and graduate students must register for 10-601.

Learning Outcomes: By the end of the course, students should be able to:

2. Prerequisites

Students entering the class are expected to have a pre-existing working knowledge of probability, linear algebra, statistics and algorithms; some recitation sessions will be held to review basic concepts.

  1. You need to have, before starting this course, significant experience programming in a general programming language. Specifically, you need to have written from scratch programs consisting of several hundred lines of code. For undergraduate students, this will be satisfied for example by having passed 15-122 (Principles of Imperative Computation) with a grade of ‘C’ or higher, or comparable courses or experience elsewhere.

    Note: For each programming assignment, you will be required to use Python. You will be expected to know, or be able to quickly pick up, that programming language.

  2. You need to have, before starting this course, basic familiarity with probability and statistics, as can be achieved at CMU by having passed 36-217 (Probability Theory and Random Processes) or 36-225 (Introduction to Probability and Statistics I), or 15-359, or 21-325, or comparable courses elsewhere, with a grade of ‘C’ or higher.
  3. You need to have, before starting this course, college-level maturity in discrete mathematics, as can be achieved at CMU by having passed 21-127 (Concepts of Mathematics) or 15-151 (Mathematical Foundations of Computer Science), or comparable courses elsewhere, with a grade of ‘C’ or higher.

You must strictly adhere to these prerequisites! Even if CMU’s registration system does not prevent you from registering for this course, it is still your responsibility to make sure you have all of these prerequisites before you register.

3. Recommended Textbooks

This course does not exactly follow any one textbook. However, most lectures will have some optional reading to help you better understand the material or see a different presentation/perspective. We recommend you read these after the corresponding lecture. These readings will be drawn from the following texts, many of which are freely available online:

The textbook below is a great resource for those hoping to brush up on the prerequisite mathematics background for this course:

4. Course Components

The graded components of this course consist of participation in lectures, midterm and final exams, and homework assignments. The breakdown is as follows:

We will convert numerical course grades to letter grades based on grade boundaries that are determined at the end of the semester. The following is a list of upper bounds on the grade cutoffs we will use; in all likelihood, these will be adjusted down at the end of the semester:


There are two types of homework assignments in this course: programming and written. The programming assignments will ask you to implement machine learning algorithms from scratch; they emphasize understanding of real-world applications, building end-to-end systems, and experimental design. The written assignments will focus on core concepts, “on-paper” implementations of classic learning algorithms, derivations, and understanding of theory.

LaTeX is a valuable tool for communicating machine learning concepts to others. In order to encourage you to use LaTeX, we will give you 1 bonus point on each homework that you write up entirely in LaTeX. We will always release a LaTeX starter template.

Midterm and Final Exams

You are required to attend all the exams. Unless otherwise noted, all exams will be closed-book. You may bring one sheet of A4 or letter-sized paper as a cheatsheet (both back and front may be used). You are encouraged to handwrite this cheatsheet as a form of preparing for the exam but you may typeset it if you so choose.

The midterm exams occur during class, while the final exam will be scheduled by the registrar sometime during the official final exams period. The dates of all these exams can be found on the lecture schedule; please plan your travel accordingly, as we will not be able to accommodate individual travel needs.

If you have an unavoidable conflict with an exam (e.g., an exam in another course), notify us by making a private post on Ed.


We will be using PollEverywhere for in-class polls. In order to access these polls, you must create an account using your CMU email address. You can always access the latest poll at

You will always be allowed to submit multiple times so if there are multiple questions during a lecture, you should submit multiple times. Your participation grade will be based on the percentage of in-class polls answered:

The correctness of your responses will not be taken into account when computing participation grades; all that matters is that you submit something. All in-class polls will only be live until the start of the next lecture or recitation (roughly a 24-hour period); you will receive 50% credit for any poll you respond to after the corresponding lecture ends, i.e., if you were to respond to every poll after the end of the corresponding lecture, then your overall participation grade would be 1%.
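As a rough illustration of the 50% late-response rule, the sketch below computes the percentage of polls answered. It assumes that a poll answered after the lecture ends counts as half a poll; the function name `participation_percentage` and its parameters are hypothetical, and the mapping from this percentage to the participation grade is not reproduced here.

```python
def participation_percentage(on_time, late, total_polls):
    """Percentage of polls answered, with late responses worth half.

    Assumption (not stated explicitly by the course): a response submitted
    after the corresponding lecture ends counts as 0.5 of a poll.
    """
    if total_polls <= 0:
        raise ValueError("total_polls must be positive")
    if on_time < 0 or late < 0 or on_time + late > total_polls:
        raise ValueError("response counts must fit within total_polls")
    return 100.0 * (on_time + 0.5 * late) / total_polls


# Answering every one of 20 polls late yields 50% poll participation.
print(participation_percentage(on_time=0, late=20, total_polls=20))  # 50.0
```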

5. Office Hours

The schedule of office hours will always appear on the course calendar. All office hours will be held in person. Instructor office hours will (usually) be held immediately after class in either the classroom or a nearby space. We encourage you to stick around and ask any questions you have about lecture material, homework problems, exam preparation, course logistics, etc.

In office hours, when it is your turn, you should pose your question to the TA(s) and they will determine whether or not your question would be best addressed privately or publicly, i.e., to anyone in the room who wants to listen in.

We will make use of the following (informal) rules:

While you're awaiting your turn, we encourage you to listen in to the answers to any publicly answered questions. Please be courteous and allow the student who posed the question to primarily direct the discussion with the TA. We also encourage you to collaborate with others (following our collaboration policies below) while waiting.

6. General Policies

Late homework policy

Late homework submissions are only eligible for 75% of the points the first day (24-hour period) after the deadline, 50% the second, and 25% the third.

You have a total of 9 grace days for use on any homework assignment. We will automatically keep a tally of these grace days for you; they will be applied greedily. No assignment will be accepted more than 3 days after the deadline. This has two important implications: (1) you may not use more than 3 grace days on any single assignment, and (2) you may not combine grace days with the late policy above to submit more than 3 days late.

HW3, HW6, and HW9 will not be accepted more than 1 day after the deadline, so that we can hold the solution session before the subsequent exams. To ensure you receive graded feedback before the exams, you must submit HW3, HW6, and HW9 on time.

All homeworks will be submitted electronically via Gradescope. As such, lateness will be determined by the latest timestamp of any part of your submission. For example, suppose the homework requires two submission uploads – if you submit the first upload on time but the second upload 1 minute late, your entire homework will be penalized for the full 24-hour period.
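The late policy above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's actual grading script: the names `days_late` and `late_credit` are hypothetical, grace days are assumed to cover late days first with the penalty tier set by the remaining uncovered days, and a rejected submission is assumed to consume no grace days.

```python
import math
from datetime import datetime, timedelta

# Credit by number of penalized late days: 75%, 50%, 25% (from the policy).
LATE_CREDIT = {0: 1.0, 1: 0.75, 2: 0.50, 3: 0.25}


def days_late(deadline, upload_times):
    """Whole 24-hour periods late, judged by the latest upload timestamp."""
    latest = max(upload_times)
    if latest <= deadline:
        return 0
    return math.ceil((latest - deadline) / timedelta(days=1))


def late_credit(deadline, upload_times, grace_days_left, max_days_late=3):
    """Return (credit multiplier, grace days used).

    Assumptions: grace days are applied greedily before any penalty, and a
    submission past the cutoff earns no credit and uses no grace days.
    Pass max_days_late=1 for HW3, HW6, and HW9.
    """
    late = days_late(deadline, upload_times)
    if late > max_days_late:
        return 0.0, 0
    used = min(late, grace_days_left)
    return LATE_CREDIT[late - used], used


deadline = datetime(2022, 5, 24, 13, 0)
uploads = [deadline - timedelta(hours=1), deadline + timedelta(minutes=1)]
# One upload is 1 minute late, so the whole homework counts as 1 day late.
print(late_credit(deadline, uploads, grace_days_left=0))  # (0.75, 0)
```

With at least one grace day remaining, the same submission would instead receive full credit and use one grace day.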


In general, we do not grant extensions on assignments. There are several exceptions:

For any of the above situations, you may request an extension by emailing Brynn at bedmunds@andrew.cmu.edu; do not email the instructor or TAs. Please be specific about which assessment(s) you are requesting an extension for and the number of hours requested. The email should be sent as soon as you are aware of the conflict and at least 5 days prior to the deadline. In the case of an emergency, no notice is needed.

If this is a medical emergency or mental health crisis, you must also CC your CMU college liaison and/or your academic advisor. Do not submit any medical documentation to the course staff. If necessary, your college liaison and The Division of Student Affairs (DoSA) will request such documentation and they will view the health documentation and conclude whether a retroactive extension is appropriate. If you haven’t interacted with your college liaison before, they are experienced student affairs staff who work in partnership with students, housefellows, advisors, faculty, and associate deans in each college to assure support for students regarding their overall Carnegie Mellon experience.

Audit Policy

Formal auditing of this course is permitted. You must follow the official procedures for a course audit as outlined by the HUB/registrar. Please do not email the instructor requesting permission to audit. Instead, you should first register for the appropriate section. Next, fill out the Course Audit Approval form and obtain the instructor’s signature in person immediately after class.

Auditors are required to:

Pass/Fail Policy

You are allowed to take this course as Pass/Fail; instructor permission is not required. The letter grade that serves as the cutoff for a Pass will depend on your specific program; we do not specify whether or not you Pass but rather we compute your letter grade the same as everyone else in the class and your program converts that letter grade to a Pass or Fail depending on their cutoff. Be sure to check with your program/department as to whether you can count a Pass/Fail course towards your degree requirements.

Accommodations for Students with Disabilities

If you have a disability and have an accommodations letter from the Disability Resources office, please email Brynn at to set up a meeting for the purposes of discussing your accommodations and needs as early in the semester as possible. She will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at

7. Technologies

We will use a variety of technologies throughout the summer:

8. Collaboration and Academic Integrity

Read this carefully!

Collaboration among Students

The purpose of student collaboration is to facilitate learning, not to circumvent it. Studying the material in groups is strongly encouraged. You are also allowed to seek help from other students in understanding the material needed to solve a particular homework problem, provided any written notes (including code) are taken on an impermanent surface (e.g., whiteboard, chalkboard), and provided learning is facilitated, not circumvented. The actual solution must be written by each student alone.

A good method to follow when collaborating is to meet with your peers, discuss ideas at a high level, but do not copy down any notes from each other or from a white board. Any scratch work done at this time should be your own only. Before writing the assignment solutions, you should make sure that you are doing this without anyone else present, putting all notes away, closing all tabs on your computer, and writing it completely by yourself with no other resources.

You are absolutely not allowed to share/compare answers or screen share your work with one another.

The presence or absence of any form of help or collaboration, whether given or received, must be explicitly stated and disclosed in full by all involved. Specifically, each assignment solution must include answers to the following questions:

  1. Did you receive any help whatsoever from anyone in solving this assignment? Yes / No.
    • If you answered ‘yes’, give full details: ____________
    • (e.g., "Jane Doe explained to me what is asked in Question 3.4")
  2. Did you give any help whatsoever to anyone in solving this assignment? Yes / No.
    • If you answered ‘yes’, give full details: _____________
    • (e.g., "I pointed Joe Smith to section 2.3 since he didn’t know how to proceed with Question 2")
  3. Did you find or come across code that implements any part of this assignment? Yes / No. (See below policy on "found code")
    • If you answered ‘yes’, give full details: _____________
    • (book & page, URL & location within the page, etc.).

If you gave help after turning in your own assignment and/or after answering the questions above, you must update your answers before the assignment’s deadline, if necessary by emailing the course staff.

Collaboration without full disclosure will be handled severely, in compliance with CMU’s Policy on Academic Integrity.

Previously Used Assignments

Some of the homework assignments used in this class may have been used in prior offerings, in classes at other institutions, or elsewhere. Solutions to them may be, or may have been, available online, or from other people or sources. It is explicitly forbidden to use any such sources, or to consult people who have solved these problems before. It is explicitly forbidden to search for these problems or their solutions on the internet. You must solve the homework assignments completely on your own. We will be actively monitoring your compliance. Collaboration with other students who are currently taking the class is allowed, but only under the conditions stated above.

Policy Regarding "Found Code"

You are encouraged to read books and other instructional materials, both online and offline, to help you understand the concepts and algorithms taught in class. These materials may contain example code or pseudocode, which may help you better understand an algorithm or an implementation detail. However, when you implement your own solution to an assignment, you must put all materials aside, and write your code completely on your own, starting "from scratch". Specifically, you may not use any code you found or came across. If you find or come across code that implements any part of your assignment, you must disclose this fact in your collaboration statement.

Duty to Protect One’s Work

Students are responsible for proactively protecting their work from copying and misuse by other students. If a student’s work is copied by another student, the original author is also considered to be in violation of the course policies. It does not matter whether the author allowed the work to be copied or was merely negligent in preventing it from being copied. When overlapping work is submitted by different students, both students will be punished.

To protect future students, do not post your solutions publicly, either during the course or afterwards.

Penalties for Violations of Course Policies

All violations of course policies (even the first one) will always be reported to the university authorities (your department head, associate dean, the dean of Student Affairs, etc.) as an official Academic Integrity Violation and will carry severe penalties.

  1. The penalty for the first violation is a negative 100% on the assignment, i.e., it would have been better to submit nothing and receive a 0%.
  2. The penalty for the second violation is failure in the course, and can even lead to dismissal from the university.

9. Support

Take care of yourself. Do your best to maintain a healthy lifestyle by eating well, exercising, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress.

All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful.

If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at

If you or someone you know is feeling suicidal or in danger of self-harm, call someone immediately, day or night:

10. Diversity

We must treat every individual with respect. We are diverse in many ways, and this diversity is fundamental to building and maintaining an equitable and inclusive campus community. Diversity can refer to multiple ways that we identify ourselves, including but not limited to race, color, national origin, language, sex, disability, age, sexual orientation, gender identity, religion, creed, ancestry, belief, veteran status, or genetic information. Each of these diverse identities, along with many others not mentioned here, shapes the perspectives our students, faculty, and staff bring to our campus. We, at CMU, will work to promote diversity, equity and inclusion not only because diversity fuels excellence and innovation, but because we want to pursue justice. We acknowledge our imperfections while we also fully commit to the work, inside and outside of our classrooms, of building and sustaining a campus community that increasingly embraces these core values.

Each of us is responsible for creating a safer, more inclusive environment.

Unfortunately, incidents of bias or discrimination do occur, whether intentional or unintentional. They contribute to creating an unwelcoming environment for individuals and groups at the university. Therefore, the university encourages anyone who experiences or observes unfair or hostile treatment on the basis of identity to speak out for justice and support, within the moment of the incident or after the incident has passed. Anyone can share these experiences using the following resources:

All reports will be documented and deliberated to determine if there should be any following actions. Regardless of incident type, the university will use all shared experiences to transform our campus climate to be more equitable and just.



Instructor

Henry Chai

Education Associate

Brynn Edmunds

Teaching Assistants

Ayush Khandelwal
Boyang (Jack) Lyu
Brendon Gu

Chutian Weng
Sana Lakdawala

Class Mascot

Neural the Narwhal


Date | Topic | Slides | Readings/Resources
Mon, 5/16 | No Class: Instead watch this welcome video | Logistics.pdf |
Tue, 5/17 | Notation and Problem Formulation | Lecture 1 (Inked) |
Wed, 5/18 | Decision Trees | Lecture 2 (Inked) | Daumé III, Chapter 1: Decision Trees; Supplementary ID3 Example; Olah, Visual Information Theory (blog post)
Mon, 5/23 | Nearest Neighbors | Lecture 3 (Inked) | Daumé III, Chapter 3: Geometry and Nearest Neighbors; Warm-up Handout
Tue, 5/24 | Model Selection | Lecture 4 (Inked) | Daumé III, Chapter 2: Limits of Learning
Wed, 5/25 | Perceptron | Lecture 5 (Inked) |
Mon, 5/30 | No Class (Memorial Day) | |
Tue, 5/31 | Linear Regression | Lecture 6 (Inked) | Murphy, Chapters 7.1-7.3
Wed, 6/01 | MLE/MAP | Lecture 7 (Inked) | Mitchell, Estimating Probabilities
Mon, 6/06 | Naïve Bayes | Lecture 8 (Inked) | Mitchell, Naive Bayes and Logistic Regression; Murphy, Chapter 3.5
Tue, 6/07 | Logistic Regression | Lecture 9 (Inked) | Mitchell, Naive Bayes and Logistic Regression; Murphy, Chapters 8.1-8.3
Wed, 6/08 | Feature Engineering and Regularization | Lecture 10 (Inked) | Murphy, Chapter 7.5
Mon, 6/13 | Exam 1 Review | Lecture 11 (Inked) |
Tue, 6/14 | Exam 1 (In-class) | |
Wed, 6/15 | Neural Networks | Lecture 12 (No Ink) | Mitchell, Chapters 4.1-4.6; Multiclass Classification Handout
Mon, 6/20 | No Class (Juneteenth) | |
Tue, 6/21 | Backpropagation | Lecture 13 (Inked) | Mitchell, Chapters 4.1-4.6
Wed, 6/22 | Deep Learning | Lecture 14 (Inked) |
Mon, 6/27 | No Class (Summer Break) | |
Tue, 6/28 | No Class (Summer Break) | |
Wed, 6/29 | No Class (Summer Break) | |
Mon, 7/04 | No Class (Independence Day) | |
Tue, 7/05 | Learning Theory | Lecture 15 (Inked) | Mitchell, Chapters 7.1-7.3
Wed, 7/06 | Learning Theory | Lecture 16 (Inked) | Mitchell, Chapter 7.4
Mon, 7/11 | Bayesian Networks | Lecture 17 (Inked) | Murphy, Chapters 10.1-10.5
Tue, 7/12 | Hidden Markov Models | Lecture 18 (Inked) | Murphy, Chapters 17.1-17.5
Wed, 7/13 | Hidden Markov Models | Lecture 19 (Inked) | Murphy, Chapters 17.1-17.5
Mon, 7/18 | Exam 2 Review | Lecture 20 (Inked) |
Tue, 7/19 | Exam 2 (In-class) | |
Wed, 7/20 | Markov Decision Processes | Lecture 21 (Inked) | Mitchell, Chapter 13
Mon, 7/25 | Value & Policy Iteration | Lecture 22 (Inked) | Mitchell, Chapter 13
Tue, 7/26 | Q-learning/Deep Reinforcement Learning | Lecture 23 (Inked) | Mitchell, Chapter 13
Wed, 7/27 | Clustering | Lecture 24 (Inked) | Daumé III, Chapter 15: Unsupervised Learning; Murphy, Chapters 25.5.1-25.5.2
Mon, 8/01 | Dimensionality Reduction | Lecture 25 (Inked) | Daumé III, Chapter 15: Unsupervised Learning; Murphy, Chapters 12.2.1-12.2.3
Tue, 8/02 | Random Forests | Lecture 26 (Inked) |
Wed, 8/03 | Boosting | Lecture 27 (Inked) | Schapire, The Boosting Approach to Machine Learning: An Overview (2001)
Mon, 8/08 | Algorithmic Bias | Lecture 28 (Inked) |
Tue, 8/09 | Exam 3 Review | Lecture 29 (Inked) |
Fri, 8/12 (4:00 - 5:20 PM) | Exam 3 | |


Attendance at recitations is not required, but strongly encouraged. Recitations will be interactive and focus on problem solving; we strongly encourage you to actively participate. A problem sheet will usually be released prior to the recitation. If you are unable to attend one or you missed an important detail, feel free to stop by office hours to ask the TAs about the content that was covered. Of course, we also encourage you to exchange notes with your peers.

Date | Topic | Handout
Thu, 5/19 | Recitation 1: HW1 | Recitation 1 (Solutions)
Thu, 5/26 | Recitation 2: HW2 | Recitation 2 (Solutions)
Thu, 6/02 | Recitation 3: HW3 | Recitation 3 (Solutions)
Thu, 6/09 | Exam 1 Review | Exam 1 Practice Problems (Solutions)
Thu, 6/16 | Recitation 4: HW4 | Recitation 4 (Solutions)
Thu, 6/23 | Recitation 5: HW5 | Recitation 5 (Solutions)
Thu, 6/30 | No Recitation (Summer Break) |
Thu, 7/07 | Recitation 6: HW6 | Recitation 6 (Solutions)
Thu, 7/14 | Exam 2 Review | Exam 2 Practice Problems (Solutions)
Thu, 7/21 | Recitation 7: HW7 | Recitation 7 (Solutions)
Thu, 7/28 | Recitation 8: HW8 | Recitation 8 (Solutions)
Thu, 8/04 | Recitation 9: HW9 | Recitation 9 (Solutions)
Wed, 8/10 | Exam 3 Review | Exam 3 Practice Problems (Solutions)


Release Date | Topic | Files | Due Date
Tue, 5/17 | HW1: Background Material | HW1, Overleaf | Tue, 5/24 at 1:00 PM
Tue, 5/24 | HW2: Decision Trees | HW2, Overleaf | Tue, 5/31 at 1:00 PM
Tue, 5/31 | HW3: KNN, Perceptron, and Linear Regression | HW3, Overleaf | Tue, 6/07 at 1:00 PM
Wed, 6/15 | HW4: Logistic Regression | HW4, Overleaf | Wed, 6/22 at 1:00 PM
Wed, 6/22 | HW5: Neural Networks | HW5, Overleaf | Wed, 7/06 at 1:00 PM
Wed, 7/06 | HW6: Generative Models and Learning Theory | HW6, Overleaf | Wed, 7/13 at 1:00 PM
Wed, 7/20 | HW7: Graphical Models | HW7, Overleaf | Wed, 7/27 at 1:00 PM
Wed, 7/27 | HW8: Reinforcement Learning | HW8, Overleaf | Wed, 8/03 at 1:00 PM
Wed, 8/03 | HW9: Unsupervised Learning and Ensemble Methods | HW9, Overleaf | Tue, 8/09 at 1:00 PM

Course Calendar