Probabilistic Programming Languages

CMU 15-819

Instructors: Feras Saad and Jan Hoffmann
Location: Wean Hall (WEH) 5312
Date and Time: Monday / Wednesday, 09:30–10:50
Contact: ppl-instructors@cs.cmu.edu

Course Info

Probabilistic programming is an approach to computing based on the idea that probabilistic models can be naturally and efficiently represented as executable code. This idea has enabled researchers to formalize, automate, and scale up modeling and inference; to make modeling and inference accessible to a broader audience of developers and domain experts; and to develop new programmable systems that integrate symbolic, differentiable, and probabilistic reasoning techniques.

In simple terms, a probabilistic program is a traditional computer program that is augmented with the ability to generate and observe random variables drawn from various probability distributions. These operations form the basis of Monte Carlo simulation, randomized algorithms, and Bayesian inference. What are the principles for developing probabilistic programming systems, and how can we use them in practice? This course provides a first introduction to probabilistic programming from theoretical and applied perspectives.
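As an illustration of "generate" and "observe", the following minimal Python sketch may help. It is not taken from the course materials: the coin model, the function names `coin_model` and `posterior_by_rejection`, the three candidate bias values, and the fixed seed are all assumptions chosen for the example. The program generates a latent coin bias and three flips, and "observing" is approximated by rejection sampling: only runs whose generated flips match the observed data are kept.

```python
import random

# Hypothetical toy model (an assumption for illustration only).
BIASES = [0.1, 0.5, 0.9]

def coin_model(rng):
    """One run of the program: generate a bias, then generate three flips."""
    bias = rng.choice(BIASES)                         # generate: uniform prior over the bias
    flips = [rng.random() < bias for _ in range(3)]   # generate: three coin flips
    return bias, flips

def posterior_by_rejection(observed, num_runs=20000, seed=0):
    """Estimate P(bias | flips == observed) by discarding inconsistent runs."""
    rng = random.Random(seed)
    counts = {b: 0 for b in BIASES}
    for _ in range(num_runs):
        bias, flips = coin_model(rng)
        if flips == observed:                         # observe: keep runs that match the data
            counts[bias] += 1
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}

posterior = posterior_by_rejection([True, True, True])
for b in BIASES:
    print(f"P(bias = {b} | three heads) ~ {posterior[b]:.3f}")
```

For this model the exact posterior probability of bias 0.9 given three heads is 0.729 / 0.855, about 0.853, so the estimate should land close to that value. Rejection sampling is used here only because it is the simplest way to show conditioning; it becomes impractical when the observed data are unlikely under the prior.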

The first part of the course covers foundational concepts in probabilistic language design and semantics. The second part covers algorithms and programmatic interfaces for modeling and inference with probabilistic programs. Applications will be given in areas that range from program analysis to data science.

Learning Outcomes

By the end of the course, students will acquire the skills to:
  • Assign rigorous mathematical semantics to probabilistic programs
  • Reason about the correctness and efficiency of probabilistic code
  • Implement and diagnose algorithms for probabilistic inference
  • Identify fundamental computational trade-offs between expressiveness, scalability, and automation
  • Use existing probabilistic programming systems for modeling and inference tasks
  • Design new probabilistic languages for domain-specific analyses

Prerequisite Knowledge

The course assumes a strong mathematical background in real analysis and probability theory, as well as extensive programming skills with exposure to functional programming. Students should be able to read and synthesize knowledge from diverse sources, including textbooks, software documentation, and academic research papers.

Undergraduates and master's students are welcome to attend the course. CMU courses that provide good preparation for 15-819 include 15-259 and 15-150 (at a minimum), as well as 15-312 and 10-708 (ideal), together with 21-355/21-356.

Schedule

The course will be taught primarily on the blackboard, along with live programming demos and three interactive labs. Materials will be posted to EdStem.

Lecture 1 Mon Jan 12 Introduction
Lecture 2 Wed Jan 14 Probability Foundations
No Class Mon Jan 19 MLK Day
Lecture 3 Wed Jan 21 Inductive Definitions, Syntax, and Variables
Lecture 4 Mon Jan 26 Static Semantics of System T with Lists
Lecture 5 Wed Jan 28 Dynamic Semantics of System T with Lists
Lecture 6 Mon Feb 2 Normalization for System T
Lecture 7 Wed Feb 4 Sums, Products, and Base Types
Lecture 8 Mon Feb 9 Syntax and Static Semantics of SystemP
Lecture 9 Wed Feb 11 Dynamics of SystemP (Generation Semantics)
Lecture 10 Mon Feb 16 Dynamics of SystemP (Density Semantics)
Lecture 11 Wed Feb 18 Probabilistic Inference and Rejection Sampling
Lecture 12 Mon Feb 23 Importance Sampling: Motivation and Examples
Lecture 13 Wed Feb 25 Importance Sampling: General Theory
No Class Mon Mar 2 Spring Break
No Class Wed Mar 4 Spring Break
Lecture 14 Mon Mar 9 Markov Chain Monte Carlo
Lecture 15 Wed Mar 11 Gen Programming Lab 1
Lecture 16 Mon Mar 16 PoP Special Seminar [NSH 3305 at 11:00]
Lecture 17 Wed Mar 18 Gen Programming Lab 2
Lecture 18 Mon Mar 23 Gen Programming Lab 3
Lecture 19 Wed Mar 25 SystemP with Continuous Variables
Lecture 20 Mon Mar 30 The Measure Space of Traces
Lecture 21 Wed Apr 1 Conditioning on Probability Zero Events 1
Lecture 22 Mon Apr 6 Conditioning on Probability Zero Events 2
Lecture 23 Wed Apr 8 Compiling Probabilistic Programs 1
Lecture 24 Mon Apr 13 Compiling Probabilistic Programs 2
Lecture 25 Wed Apr 15 Final Project Work
Lecture 26 Mon Apr 20 Final Project Work
Lecture 27 Wed Apr 22 Student Presentations

Homework

Policies

Grading

Grade Components: The final grade is computed as follows:

  • Homework 1: 15%
  • Homework 2: 15%
  • Homework 3: 15%
  • Homework 4: 15%
  • Final Project: 40%

Attendance

Attendance in lecture is mandatory. If you expect to miss a lecture, please inform the instructors beforehand.

Lateness

All assignments have due dates indicated on the schedule overview and on the handout. Submitting assignments on time lets the course staff provide feedback in a more timely and efficient manner. Assignments build on each other, so timely submissions are crucial to your progress in the class. However, if you are not able to meet a submission deadline, notify the instructors in advance to make alternative arrangements.

Collaboration

We believe in collaboration. Discussing problems with others helps you learn better. If you collaborate with others, try to get "hints" rather than "answers." You should write up your actual homework on your own. If you use an outside source (web site, book, person, etc.), you must cite that source. At the top of your homework sheet, you must list all the people with whom you discussed any problem. Even if you were the one doing the helping, you should list the other person. Crediting discussion with others will not take away any credit from you, and will prevent us from assuming cheating if your answers look similar to those of someone else.

Wellbeing

If you are experiencing distress (mentally, physically, or emotionally) that is making it difficult for you to work and make progress in the class, we are here to help you. Please reach out to Feras or Jan so we can meet and discuss.