Fall 2025
The course will focus on multi-step imperfect-information games because
most real-world strategic settings are such games. Such games beget additional
issues beyond perfect-information games like chess and Go, such as signaling,
deception, and understanding deception by others. There has been tremendous
progress in the AI community on solving such games since around 2003. This
course covers the fundamentals and the state of the art of solving such games.
Instructors |
Office |
Email |
GHC 9205 |
sandholm AT cs |
|
GHC 6213 |
ianagnos AT cs |
The course will be lecture based. At the end of the course
there will be a few lectures of project presentations by students.
Readings will consist of a mixture of papers and course notes.
Students will complete a final project. It may be done
individually or in groups of 2-3 students. It can be theoretical or
experimental, or a combination. The students can pick their course project
topic subject to instructor approval of the project proposal. The project can
be related to the student’s normal research, but it cannot be the student’s
normal research itself. We encourage creativity in the course projects.
The deadline for the project proposal is on October 6.
Grading will be as follows:
·
50%
final project
·
20%
final exam
·
20%
homework sets (there will be 2 homework sets that may include both
paper-and-pen questions and programming assignments)
·
10%
completion of readings, attendance, and participation in class discussions
Released |
Due |
Files |
|
Homework 1 |
September 28 |
October 10 |
|
Homework 2 |
October 12 |
October 31 |
On both HW1 and HW2, we will allow late submissions with a penalty of 5 points per day, rounded up to the nearest whole day, up to a maximum of 5 days (-25 points), after which we will no longer allow submissions. This late policy does not apply to project deadlines (proposal or final report), which we will not accept late. Your project proposals should be seen as a chance for you to get feedback from us on your project idea—the earlier that happens, the more useful it is. The final report deadline (Dec 6) cannot be moved later due to logistical constraints (we need time to read them all!)
Lecture |
Date |
Topic |
Reading(s) |
Lecture slides |
1 |
M 8/25 |
Introduction Course organization. Introduction to game theory. Game representations. Normal form, extensive form. Solution concepts. Properties of 2-player 0-sum games. |
||
2 |
W 8/27 |
Perfect-information games 1 Tree search methods for two-player perfect-information games: minmax search, alpha-beta pruning, iterative deepening, quiescence search, singular extension, evaluation function learning, endgame databases, horizon problem, search depth pathology, chess. |
||
3 |
W
9/3 |
Perfect-information games 2 Monte Carlo Tree Search (MCTS). AlphaGo and AlphaGo Zero. |
||
4 |
M
9/8 |
Normal-form games LP formulation of zero-sum normal-form equilibrium computation. Fictitious play and follow the regularized leader (FTRL). Online learning and regret minimization. Mirror descent (MD). Multiplicative weights (MWU). Regret matching (RM) and RM+. Self-play and connection to game-theoretic equilibria. |
||
5 |
W 9/10 |
Extensive-form games 1 Extensive-form games. Behavioral representation of a strategy. Kuhn's theorem. Sequence-form representation and sequence-form LP. Construction of counterfactual regret minimization (CFR) and proof of correctness using regret circuits. |
[Section 5.2 of MAS] [Zinkevich
et al., NIPS 2007] |
|
6 |
M 9/15 |
Learning in general-sum games: correlated equilibria and Phi-regret Correlated and coarse correlated equilibrium. Phi-regret and its connections to types of correlated equilibrium. The GGM framework, and Blum-Mansour as a special case. |
||
7 |
W 9/17 |
Algorithms for minimizing Phi-regret: nonlinear deviations and ellipsoid Achieving swap regret among low-degree deviations. Linear-swap correlated equilibria as a special case. Nonlinear deviations. Expected fixed points. Ellipsoid against hope. |
||
8 |
M 9/22 |
Faster no-regret learning dynamics and last-iterate convergence Near-optimal regret using optimism. Connections to last-iterate convergence. |
||
9 |
W 9/24 |
Extensive-form games 2: CFR
speedups Alternation. Reweighted updates of regrets and strategies, LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies. Optimistic regret minimization algorithms. |
[Brown
& Sandholm, ICML 2017] |
|
10 |
M 9/29 |
Game abstraction 1 Practical state of the art. Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction. |
||
11 |
W 10/1 |
Game abstraction 2 Abstraction algorithm for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling. |
||
12 |
M 10/6 |
State-of-the-art for two-player no-limit Texas hold'em: Libratus (I) History of game theory and AI for poker, rules of Texas hold'em, man-machine match setup. |
||
12 |
W 10/8 |
State-of-the-art for two-player no-limit Texas hold'em: Libratus (II) Subgame solving in imperfect-information games. |
||
12 |
M 10/20 |
State-of-the-art for two-player no-limit Texas hold'em: Libratus (III) Solving multiple subgames, self-improver, transforming poker knowledge. |
||
13 |
W 10/22 |
Knowledge-limited
subgame solving (KLSS) Limits of subgame solving with common knowledge. Subgame solving without common knowledge. Superhuman Fog-of-War chess. Safe KLSS. |
||
14 |
M 10/27 |
State-of-the-art for multi-player no-limit Texas hold'em: Pluribus Depth-limited subgame solving. |
||
15 |
W 10/29 |
Deep learning
in game solving Monte Carlo CFR (MCCFR), and sampling approaches. Deep CFR as an alternative to abstraction. DREAM, ESCHER. |
||
18 |
M 11/3 |
Team games. Guest lecture by Brian Zhang. Team maxmin equilibrium and TMECor; why the latter is often significantly better. Realization polytope: low dimensional but hard to represent; ways around that in practice. Team DAG and Team PSRO. |
[Farina
et al., NeurIPS 2018] |
|
19 |
W 11/5 |
Automated (general!) mechanism design. Guest lecture by Brian Zhang. Stackelberg equilibria, correlated equilibria, mechanism design, and information design. Optimal (revenue-maximizing) auctions. |
||
17 |
M 11/10 |
Deep learning
in game solving 3. Guest lecture by Stephen McAleer. Game-theoretic LLM training. DeepNash for expert-level Stratego: State of the art for Stratego. Magnetic Mirror Descent. |
[Perolat et al.,
Science 2022] |
|
16 |
W 11/12 |
Double oracle-based methods Double oracle (DO), policy space response oracles (PSRO), extensive-form double oracle (XDO), anytime PSRO, self-play PSRO, diversity in PSRO. AlphaStar and OpenAI Five. |
[Heinrich and Silver, Arxiv 2016] [Vinyals
et al., Nature 2019] [Lanctot et al., NeurIPS 2017] |
|
21 |
M 11/17 |
Final exam |
|
|
22 |
W 11/19 |
Course project presentations |
||
24 |
M 11/24 |
Course project presentations |
||
25 |
M 12/1 |
Course project presentations |
||
26 |
W 12/3 |
Course project presentations |
Professor |
Title |
Year |
University |
Tuomas Sandholm and Brian Hu Zhang |
2024 |
CMU |
|
Tuomas Sandholm and Stephen McAleer |
2023 |
CMU |
|
Tuomas Sandholm and Gabriele Farina |
2021 |
CMU |
|
Vince Conitzer |
2024 |
CMU |
|
Vince Conitzer, Caspar Oesterhald, Tuomas Sandholm |
2022 |
CMU |
|
Tuomas Sandholm |
2015 |
CMU |
|
Gabriele Farina and Constantinos Daskalakis |
2023 |
MIT |
|
Christian Kroer |
2020 |
Columbia |
|
John P. Dickerson |
2018 |
UMD |
|
Fei Fang |
2018 |
CMU |
|
Yiling Chen |
Topics at the Interface between Computer Science and
Economics |
2016 |
Harvard |
Vincent Conitzer |
Computational Microeconomics: Game Theory, Social Choice, and
Mechanism Design |
2016 |
Duke |
Ariel Procaccia |
2016 |
CMU |
|
Tim Roughgarden |
2013 |
Stanford |