**Instructor:** Aditi Raghunathan (raditi at cmu dot edu)

**TA:** Christina Baek (kbaek at cs dot cmu dot edu)

**Lectures:** Tuesday, Thursday 4:40-6:00pm at GHC 4102

**Overview:**

In this advanced machine learning seminar class, we tackle the central struggle in using the powerful deep learning machinery: *what works and why?* We build a conceptual understanding of deep learning from several different angles: standard in-distribution generalization, out-of-distribution generalization, self-supervised learning, scaling laws, memorization, and more. We will read papers that contain a mix of theoretical and empirical insights, with a focus on making connections to classic ideas, identifying recurring themes, and discussing avenues for future developments. The class aims to equip students with the ability to critically reason about current advances and build a more principled understanding of them, which will hopefully spark their own research.

**Format:**

This course combines lectures with paper presentations
by the students, encouraging both fundamental knowledge
acquisition as well as open-ended discussions and new research directions.
The lectures will briefly introduce the main concepts, summarize a few key papers
and connect to classical ideas if applicable.

The paper discussions will involve *role-playing student seminars* inspired by Alec Jacobson and Colin Raffel.
We will be adopting the following roles.

- Positive reviewer: who advocates for the paper to be *accepted* at a conference (e.g., NeurIPS)
- Negative reviewer: who advocates for the paper to be *rejected* at a conference (e.g., NeurIPS)
- Archaeologist: who determines where this paper sits in the context of previous and subsequent work. They must find and report on at least one *older* paper cited within the current paper that substantially influenced it, and at least one *newer* paper that cites it. Keep an eye out for follow-up work that contradicts the takeaways of the current paper
- Academic researcher: who proposes potential follow-up projects that not only build on the current paper but are also only possible due to its existence and success
- Visitor from the past: who is a researcher from the early 2000s. They must discuss how they comprehend the results of the paper, what they like or dislike about the settings and benchmarks considered, and what surprises them the most about the presented results

**Prerequisites:** There are no official prerequisites, but knowledge of probability, linear algebra, and machine learning is expected.

**Course requirements:**

- Regular participation (25%): Written summaries of assigned readings must be submitted before each class, along with participation in the online discussion
- Paper presentations (40%): Each student will give 1-2 paper presentations over the course of the semester. Each paper is presented by 2 students, where each student takes on the role of either the positive or negative reviewer plus one other role from the list above
- Class participation during lectures and paper discussions (10%)
- Final project (25%) if taking for letter grade

**Important dates:**

- Project proposal: Oct 13, 2022
- Midway project check: Nov 15, 2022
- Project report, in the style of a NeurIPS paper: Dec 12, 2022
- Final project presentations: lectures starting Dec 15, 2022

**Topics (tentative):**

- Generalization in deep learning (uniform convergence, NTK, …)
- Implicit biases (algorithmic regularization, simplicity bias, …)
- Brittleness and robust training (min-max robustness, spurious correlations, domain invariance, …)
- ML with unlabeled data (semi-supervised learning, self-supervised learning, …)
- Adaptation (fine-tuning, few-shot learning, continual learning, …)
- Large language models (transformers, in-context learning, prompt tuning, scaling laws, …)
- Implications for security/privacy, fairness, and ethics

**Schedule:**

| Date | Topic | Content | Presenter |
| --- | --- | --- | --- |
| 08/30/2022 | **[Lecture 1]** Introduction | Why does this course exist? Course logistics; overview of the course | Aditi Raghunathan |
| 09/01/2022 | **[Lecture 2]** The generalization puzzle | Uniform convergence, implicit regularization | Aditi Raghunathan |
| 09/06/2022 | **[Paper discussion 1]** Generalization | The tradeoffs of large scale learning; The lottery ticket hypothesis: finding sparse, trainable neural networks | |
| 09/08/2022 | **[Paper discussion 2]** Generalization | | |
| 09/13/2022 | **[Guest Lecture]** Limitations of uniform convergence | | Vaishnavh Nagarajan |
| 09/15/2022 | **[Lecture 3]** Phenomena captured by simpler models | Double descent, bias-variance tradeoff, kernel methods | |
| 09/20/2022 | **[Paper discussion 3]** | Neural Tangent Kernel: convergence and generalization in Neural Networks; Benign overfitting in linear regression | |
| 09/22/2022 | **[Lecture 4]** Robustness of deep networks | Out-of-distribution generalization, adversarial examples, spurious correlations, shortcut learning and simplicity bias | Aditi Raghunathan |
| 09/27/2022 | **[Paper discussion 4]** Why are models brittle? (I) | | |
| 09/29/2022 | **[Lecture 5]** Robust training of deep networks | Robust optimization, accuracy tradeoff, effect of overparameterization | Aditi Raghunathan |
| 10/04/2022 | **[Paper discussion 5]** Why are models brittle? (II) | | |
| 10/06/2022 | **[Lecture 6]** Data poisoning, causality | Discussion of data poisoning, intro to causality | Aditi Raghunathan |
| 10/11/2022 | **[Paper discussion 6]** Causality | | |
| 10/13/2022 | **[Lecture 7]** Unlabeled data I | A brief history | Aditi Raghunathan |
| 10/18/2022 | **[Fall Break]** | | |
| 10/20/2022 | **[Fall Break]** | | |
| 10/25/2022 | **[Paper discussion 7]** Learning from unlabeled data | | |
| 10/27/2022 | **[Guest Lecture]** Self-supervised learning | | Alexei A. Efros |
| 11/01/2022 | **[Paper discussion 8]** Learning from unlabeled data | | |
| 11/03/2022 | **[Lecture 8]** Unlabeled data II | Analysis of self-training, self-supervision and domain adaptation methods | |
| 11/08/2022 | **[Paper discussion 9]** Distribution shifts with access to unlabeled data | | |
| 11/10/2022 | **[Lecture 9]** Foundation models | Transfer learning, analysis of fine-tuning methods, in-context learning | |
| 11/15/2022 | **[Guest lecture]** | | Graham Neubig, Maarten Sap |
| 11/17/2022 | **[Paper discussion 10]** | | |
| 11/22/2022 | **[Paper discussion 11]** | | |
| 11/24/2022 | **[Thanksgiving break]** | | |
| 11/29/2022 | **[NeurIPS break]** | | |
| 12/01/2022 | **[NeurIPS break]** | | |
| 12/06/2022 | **[Guest Lecture]** Privacy and fairness in modern machine learning | | Nicholas Carlini |
| 12/08/2022 | **[Guest Lecture]** Benchmarking large language models | | Rishi Bommasani |
| 12/13/2022 | **[Paper discussion 12]** | | |
| 12/15/2022 | **[Project presentations]** | | |
| 12/20/2022 | **[Project presentations]** | | |