Data Science for Product Managers

Canvas:

https://canvas.cmu.edu/courses/45491

Semester:

2025 Spring (05-898, B4)

Instructors:

Sherry Tongshuang Wu (Office hour: Wednesday 2-3pm, NSH 3525)

TAs:

Jaehee Kim (Office hour: Wednesdays 2:30-3:30pm, GHC 7501)

Yeonji Baek (Office hour: Fridays, 2-3pm, GHC 7101)

Time:

Monday / Wednesday 12:30-1:50pm

Location:

NSH 3002

The goal of this course is to provide you with the tools to understand and perform data science as it relates to product managers. You will learn and perform customer-focused data analysis through a combination of lectures, readings, and practical skills development. Over the course of the semester, you will learn about data science and the entire data pipeline, from collecting and analyzing data to interacting with it. The course assignments will involve programming in Python (pandas, scikit-learn, altair, etc.), but you will be allowed to complete them with GenAI. The learning goals of the course are as follows:

  • To understand basic data manipulation and wrangling
  • To introduce common problems with data such as structural problems, outliers, incomplete data, and dirty data
  • To introduce basic concepts in data interpretation including feature generation, statistical analysis and classification (assumptions of data, bad data, missing data, outliers and winsorizing, data shape)
  • To introduce concepts in exploratory data science and the use of visualization
  • To provide practical applied examples of data science using machine learning techniques
  • To explore and understand Data Science Ethics and ML Ethics
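
As a small illustration of one term in the goals above, winsorizing means clamping extreme values to a chosen percentile boundary instead of deleting them. The sketch below is not course material: the function name, the percentile convention, and the sample data are all assumptions made for illustration, and it uses only the Python standard library.

```python
def winsorize(values, lower_pct=0.05, upper_pct=0.95):
    """Clamp values outside the given percentile range to the boundary values.

    Hypothetical helper for illustration; the percentile lookup below is a
    simple index-based convention, and other conventions exist.
    """
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[int(lower_pct * (n - 1))]
    high = ordered[int(upper_pct * (n - 1))]
    # Keep the original order; only pull extreme values in to the boundaries.
    return [min(max(v, low), high) for v in values]


# Made-up session lengths in minutes, with one implausible outlier (480).
session_minutes = [12, 15, 14, 13, 16, 15, 14, 13, 12, 480]
print(winsorize(session_minutes, lower_pct=0.1, upper_pct=0.9))
```

With pandas, the same idea is usually expressed by computing the boundaries with Series.quantile and applying them with Series.clip.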

Schedule and Readings

This schedule is tentative and subject to changes.

Kickoff Session
Mon, Mar 10
Welcome & Introduction (Lecture)
Course overview, logistics, grading.
Slides
Required What it takes to become a great product manager by Julia Austin in 2017
Wed, Mar 12
Topic preview via in-class assignment (Christina Ma) (Discussion)
Please come to this lecture! It will give you a preview of all the assignments you will be doing in class. You will complete your HW0 in class and make sure you have the right setup for all the assignments afterwards.
Slides
Deadline Pre-survey
Deadline Assignment 0: Simplified Data Science pipeline preview
Data quality and provenance
Mon, Mar 17
Data Quality and Wrangling 1 (Lecture)
Understand data models, especially structured, semi-structured, and unstructured data. Intro to data qualities.
Slides
Required Guide to bad data by Quartz on GitHub
Wed, Mar 19
Data Quality and Wrangling 2 (Lecture)
Data cleaning, data transformation, data integration, data reduction
Slides
Mon, Mar 24
Data Collection, Biases, and Provenance (Lecture)
Annotation design, sampling biases, provenance, versioning
Slides
Deadline Assignment 1: Data Cleaning
Optional Hidden Technical Debt in Machine Learning Systems by Sculley, D. et al.
Get data insights
Wed, Mar 26
Exploratory Data Analysis (Lecture)
The importance of EDA and some case studies
Slides
Mon, Mar 31
Data Visualization (Lecture)
Concepts in data visualization, visual encoding, tools in visualization
Slides
Required Information Visualization (Chapter 1) by Stuart Card, Jock Mackinlay, and Ben Shneiderman in 1999
Required Getting Started by Marian Dörk
Required Data Types, Graphical Marks, and Visual Encoding Channels by Jeffrey Heer, Dominik Moritz, Jake VanderPlas, and Brock Craft
Wed, Apr 02
Statistical analysis and featurization (Lecture)
Dimension reduction, segmentation, clustering, pattern mining, statistics
Slides
Mon, Apr 07
Practical Machine Learning 1 (Lecture)
Supervised and unsupervised learning, classification, regression, clustering
Slides
Deadline Assignment 2: Exploratory Data Analysis
Required A Few Useful Things to Know about Machine Learning by Pedro Domingos in Communications of the ACM 2012
Wed, Apr 09
Practical Machine Learning 2 (Lecture)
Concepts relevant to large language models and how they can be used.
Slides
Required How to thrive in a ChatGPT world by Carl T. Bergstrom, Jevin D. West in 2025
Communicate with Data
Mon, Apr 14
Storytelling with Data (Lecture)
Concepts in data visualization, visual encoding, tools in visualization
Slides
Deadline Assignment 3: Machine Learning
Required Storytelling with data visualization by Microsoft Power BI in 2025
Required The Stories We Tell About Data: Media Types for Data-Driven Storytelling by Zhenpeng Zhao, Niklas Elmqvist in 2022
Wed, Apr 16
Ethical Data Science (Lecture)
Ethical considerations in data science, e.g. how to collect and interpret data, bias, fairness, and accountability
Slides
Required Doing Data Science (your Andrew ID will get you access) by O'Neil and Schutt (pp. 356-362: Chapter 16, from the "Being an Ethical Data Scientist" section through the end of the chapter)
Optional Hype vs. Reality at the MIT Media Lab by Nell Gluckman in Cambridge, Mass.
Mon, Apr 21
Guest Lecture (Debashish Sasmal, UPMC) (Lecture) Slides
Deadline Assignment 4: Reflection and Report
Deadline Post-survey
Wed, Apr 23
Recap and Presentation (Discussion) Slides

Syllabus

Prerequisites

The class will involve programming and debugging! You should not take the course if you find programming or debugging extremely difficult because you will have to master several programming languages/concepts/libraries in very short order. That being said, the assignments that require these will have useful resources for brushing up on the topics.

Required Textbooks

There is no required textbook for this course. Readings are drawn from a variety of books, articles, and online postings, and will be provided by the instructor.

Amount of Work

This is a “6 unit” mini. As per university policy, this means that this course is expected to take students 12 hours per week, including class time. Surveys of previous students show that this is accurate.

Course Materials and Communications

Attendance

Lectures will be held in-person twice a week. A good portion of the learning in any class comes from intelligent discussion. If you don’t attend class, you cannot participate, and your performance in the class will reflect that. Rather than taking attendance, there will be pop quizzes and also artifacts collected at the end of class that were generated from in-class activities.

In case the class transitions from in-person to online, the classes will be held synchronously via Zoom. It would be highly appreciated if your video were on. I expect your full attention, professionalism, and interactive participation as if this were a real in-person class. This arrangement is not to place undue stress on you, but rather provide the best educational experience.

The excused absences this course accepts are medical and family emergencies, academic conference travel, religious events, and a small set of approved collegiate activities. If in doubt, contact me to find a solution. Note that interviews, family vacations, weddings, sleeping through alarms, etc. are not excused. Your lowest two participation grades will be dropped, allowing you to miss up to two classes without impacting your grade.

Homework and Quizzes

You will have homework assignments each week. Each week there may also be a quiz based on the lecture content, which you will complete via Canvas. I will drop the 2 lowest quiz grades.

All assignments in this course are individual: you are required to do them by yourself.

Grading

Homework will be posted to Canvas, along with its due date. Each day late will result in a 10% deduction (up to a maximum of 50% off). Students caught cheating or plagiarizing will receive no credit for the assignment. Additional actions, including assigning the student a failing grade in the class or referring the case for disciplinary action, may be taken at the discretion of the instructor. Please note that Canvas now has automated plagiarism detection built in, so please do not cheat or turn in uncited work.
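
The late-deduction arithmetic above can be sketched as follows. This is only an illustration under stated assumptions, not an official grading formula: it assumes the deduction is applied as a percentage of the earned score and that lateness is counted in whole days, and the function names are hypothetical.

```python
def late_penalty_pct(days_late, per_day_pct=10, cap_pct=50):
    """Percentage deducted for a late submission: 10% per day, capped at 50%."""
    return min(max(days_late, 0) * per_day_pct, cap_pct)


def score_after_penalty(raw_score, days_late):
    """Assumption: the deduction scales the earned score (not the maximum score)."""
    return raw_score * (100 - late_penalty_pct(days_late)) / 100


# Under these assumptions, a 90-point submission turned in 3 days late keeps 70%.
print(late_penalty_pct(3))
print(score_after_penalty(90, 3))
```

Under these assumptions, the deduction stops growing after five late days. If in doubt about how a specific case is handled, ask the instructor.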

Your final grade in this course will be based on:

Incompletes & Pass/Fail

It is the policy of this class to not give incompletes. Several assignments have in-class components, so you will need to have each one finished on time. There is no option to take this course pass/fail.

Other Information

Diversity, Equity, and Inclusion

Among the many topics in this class, we will discuss many that relate to diversity, equity, and inclusion. As your professor, I am committed to fostering and supporting an inclusive environment in my class (which extends beyond the physical room). It is our goal that students from all diverse backgrounds and perspectives are well served by this course, that students’ learning needs be addressed both in and out of class, and that the diversity that students bring to this class be viewed as a resource, strength, and benefit. Dimensions of diversity include race, age, national origin, ethnicity, gender identity and expression, intellectual and physical ability, sexual orientation, faith and non-faith perspectives, socio-economic class, political ideology, education, primary language, family status, military experience, cognitive style, and communication style. We are intentional in our aim to present materials and activities that are respectful of diversity, based on these dimensions and any other visible and invisible differences not captured in this list. Your suggestions for ensuring that the class lives up to these values are encouraged and appreciated.

Accommodations for Students with Disabilities

If you have a disability and are registered with the Office of Disability Resources, we encourage you to use their online system to notify us of your accommodations and discuss your needs with us as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, we encourage you to contact them at access@andrew.cmu.edu.

Health and Well-being

If you are experiencing COVID-like symptoms or have a recent COVID exposure, do not attend class if we are meeting in-person. Please email the instructors for accommodations.

If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help; call 412-268-2922 and visit their website at www.cmu.edu/counseling/. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help. If you or someone you know is feeling suicidal or in danger of self-harm, call someone immediately, day or night:

If the situation is life-threatening, call the police. On campus, call CMU Police: 412-268-2323. Off campus: 911.

If you have questions about this, please let the instructors know. Thank you, and have a great semester.