Computational Game Solving

Fall 2025

Focus

The course focuses on multi-step imperfect-information games, because most real-world strategic settings are such games. These games raise issues that do not arise in perfect-information games like chess and Go, such as signaling, deception, and understanding deception by others. The AI community has made tremendous progress on solving such games since around 2003. This course covers the fundamentals and the state of the art.

Course logistics

Instructors | Office | Email
Prof. Tuomas Sandholm | GHC 9205 | sandholm AT cs
Ioannis Anagnostides | GHC 6213 | ianagnos AT cs

Course structure and evaluation

The course will be lecture-based. At the end of the semester, a few class sessions will be devoted to project presentations by students.

Readings will consist of a mixture of papers and course notes.

Students will complete a final project, done individually or in groups of 2-3 students. It can be theoretical, experimental, or a combination of the two. Students choose their own project topic, subject to instructor approval of the project proposal. The project may be related to a student's normal research, but it cannot be that research itself. We encourage creativity in the course projects.

The project proposal is due October 6.

Grading will be as follows:

· 50% final project
· 20% final exam
· 20% homework sets (there will be two homework sets, which may include both pen-and-paper questions and programming assignments)
· 10% completion of readings, attendance, and participation in class discussions

Homework sets

Homework | Released | Due | Files
Homework 1 | September 28 | October 10 | HW1.zip
Homework 2 | October 12 | October 31 | HW2.zip

On both HW1 and HW2, we will allow late submissions with a penalty of 5 points per day, rounded up to the nearest whole day, up to a maximum of 5 days (-25 points), after which we will no longer accept submissions. This late policy does not apply to the project deadlines (proposal or final report), which we will not accept late. Your project proposal should be seen as a chance to get feedback from us on your project idea; the earlier that happens, the more useful it is. The final report deadline (Dec 6) cannot be moved later due to logistical constraints (we need time to read all the reports!).

Schedule (subject to change)


Lecture 1 (M 8/25): Introduction
Course organization. Introduction to game theory. Game representations. Normal form, extensive form. Solution concepts. Properties of 2-player 0-sum games.
Slides
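
As background for the properties of two-player zero-sum games listed for this lecture, von Neumann's minimax theorem can be stated as follows (a standard formulation added here for reference, not taken from the course slides):

```latex
% Minimax theorem for a two-player zero-sum game with payoff matrix A,
% where x and y range over the players' mixed strategies (probability
% simplices). The max-min and min-max values coincide; any maximizing x*
% together with any minimizing y* forms a Nash equilibrium.
\[
  \max_{x \in \Delta_m} \min_{y \in \Delta_n} x^{\top} A y
  \;=\;
  \min_{y \in \Delta_n} \max_{x \in \Delta_m} x^{\top} A y .
\]
```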

Lecture 2 (W 8/27): Perfect-information games 1
Tree search methods for two-player perfect-information games: minmax search, alpha-beta pruning, iterative deepening, quiescence search, singular extension, evaluation function learning, endgame databases, horizon problem, search depth pathology, chess.
Slides
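
For readers who want a concrete reference point for the tree-search methods listed above, here is a minimal sketch of depth-limited minmax search with alpha-beta pruning. The game-state interface (is_terminal, children, evaluate) is a hypothetical placeholder used only to show the control flow; it is not code from the course.

```python
def alphabeta(state, depth, alpha, beta, maximizing):
    """Depth-limited minmax search with alpha-beta pruning.

    `state` is assumed to expose `is_terminal()`, `children()`, and
    `evaluate()` (a heuristic evaluation from the maximizer's point of
    view); these are illustrative placeholders, not a fixed API.
    """
    if depth == 0 or state.is_terminal():
        return state.evaluate()

    if maximizing:
        value = float("-inf")
        for child in state.children():
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:   # beta cutoff: the opponent will avoid this branch
                break
        return value
    else:
        value = float("inf")
        for child in state.children():
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:   # alpha cutoff
                break
        return value
```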

Lecture 3 (W 9/3): Perfect-information games 2
Monte Carlo Tree Search (MCTS). AlphaGo and AlphaGo Zero.
Readings: [Silver et al., Nature 2017]
Slides
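
As a reference for the selection step of MCTS covered in this lecture, the UCT rule is commonly written as below (a standard textbook form; AlphaGo and AlphaGo Zero use a prior-weighted variant of this exploration bonus):

```latex
% UCT action selection at node s: pick the action maximizing the current
% mean value estimate plus an exploration bonus, where N(s) is the visit
% count of s, N(s,a) the visit count of action a at s, and c > 0 an
% exploration constant.
\[
  a^{*} \;=\; \operatorname*{arg\,max}_{a}\;
  \left[\, \bar{Q}(s,a) \;+\; c \sqrt{\frac{\ln N(s)}{N(s,a)}} \,\right].
\]
```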

Lecture 4 (M 9/8): Normal-form games
LP formulation of zero-sum normal-form equilibrium computation. Fictitious play and follow the regularized leader (FTRL). Online learning and regret minimization. Mirror descent (MD). Multiplicative weights (MWU). Regret matching (RM) and RM+. Self-play and connection to game-theoretic equilibria.
Readings: [Sections 4.1 and 4.6 of MAS], [Sections 4.2-3 of AGT]
Lecture notes · Slides
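
To make the regret-minimization topics above concrete, the following self-contained sketch runs regret matching (RM) in self-play on a zero-sum matrix game. The rock-paper-scissors payoff matrix is only an example and the code is illustrative rather than taken from the course materials; the average strategies converge to a Nash equilibrium (uniform play in this example).

```python
import numpy as np

# Example payoff matrix (row player's payoff) for rock-paper-scissors.
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

def rm_strategy(regrets):
    """Regret matching: play proportionally to positive cumulative regrets."""
    pos = np.maximum(regrets, 0.0)
    if pos.sum() > 0:
        return pos / pos.sum()
    return np.full(len(regrets), 1.0 / len(regrets))  # uniform if no positive regret

def self_play(A, iterations=10000):
    n_rows, n_cols = A.shape
    regret_x, regret_y = np.zeros(n_rows), np.zeros(n_cols)
    sum_x, sum_y = np.zeros(n_rows), np.zeros(n_cols)
    for _ in range(iterations):
        x, y = rm_strategy(regret_x), rm_strategy(regret_y)
        sum_x += x
        sum_y += y
        # Utilities of each pure action against the opponent's current strategy.
        u_x = A @ y            # row player's action utilities
        u_y = -(A.T @ x)       # column player's action utilities (zero-sum)
        regret_x += u_x - x @ u_x
        regret_y += u_y - y @ u_y
    return sum_x / iterations, sum_y / iterations  # average strategies

if __name__ == "__main__":
    x_bar, y_bar = self_play(A)
    print("average strategies:", x_bar, y_bar)  # both approach (1/3, 1/3, 1/3)
```

Swapping rm_strategy for a multiplicative-weights update would give MWU dynamics with the same self-play structure.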

Lecture 5 (W 9/10): Extensive-form games 1
Extensive-form games. Behavioral representation of a strategy. Kuhn's theorem. Sequence-form representation and sequence-form LP. Construction of counterfactual regret minimization (CFR) and proof of correctness using regret circuits.
Readings: [Section 5.2 of MAS], [Sections 3.10-11 of AGT], [Zinkevich et al., NIPS 2007], [Farina et al., ICML 2019], [F21 Lecture Notes]
Lecture notes · Slides
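
As a reference point for the CFR construction mentioned above, the key quantities from Zinkevich et al. (NIPS 2007) can be written as follows (the notation is a standard rendering added for orientation, not copied from the lecture notes):

```latex
% Counterfactual value of player i at information set I under strategy
% profile sigma: the opponents'/chance reach probability of each history
% in I, times the expected payoff from there on.
\[
  v_i(\sigma, I) \;=\; \sum_{h \in I} \pi^{\sigma}_{-i}(h)
  \sum_{z \in Z} \pi^{\sigma}(h, z)\, u_i(z).
\]
% Immediate counterfactual regret at I after T iterations (deviating to
% action a at I only), and the bound that justifies CFR: external regret
% is at most the sum of positive immediate counterfactual regrets over
% player i's information sets \mathcal{I}_i.
\[
  R^{T}_{i,\mathrm{imm}}(I) \;=\; \frac{1}{T} \max_{a \in A(I)} \sum_{t=1}^{T}
  \Big( v_i(\sigma^t|_{I \to a}, I) - v_i(\sigma^t, I) \Big),
  \qquad
  R^{T}_{i} \;\le\; \sum_{I \in \mathcal{I}_i}
  \max\big(R^{T}_{i,\mathrm{imm}}(I),\, 0\big).
\]
```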

Lecture 6 (M 9/15): Learning in general-sum games: correlated equilibria and Phi-regret
Correlated and coarse correlated equilibrium. Phi-regret and its connections to types of correlated equilibrium. The GGM framework, and Blum-Mansour as a special case.
Readings: [Blum & Mansour, JMLR 2007], [Gordon, Greenwald, and Marks, ICML 2008]
Lecture notes · Slides
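
For orientation on the Phi-regret framework mentioned above, one common way to write it is the following (a generic formulation for illustration, not copied from the lecture notes):

```latex
% Phi-regret of a learner with iterates x^1, ..., x^T facing utility
% functions u^1, ..., u^T, with respect to a set Phi of strategy
% transformations:
\[
  \mathrm{Reg}^{T}_{\Phi} \;=\; \max_{\phi \in \Phi} \sum_{t=1}^{T}
  \Big( u^{t}\!\big(\phi(x^{t})\big) - u^{t}(x^{t}) \Big).
\]
% Special cases: constant transformations give external regret (and coarse
% correlated equilibria in self-play); swaps of actions give swap regret
% (and correlated equilibria in self-play).
```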

Lecture 7 (W 9/17): Algorithms for minimizing Phi-regret: nonlinear deviations and ellipsoid
Achieving swap regret among low-degree deviations. Linear-swap correlated equilibria as a special case. Nonlinear deviations. Expected fixed points. Ellipsoid against hope.
Readings: [Zhang et al., NeurIPS 2024], [Farina and Pipis, NeurIPS 2024], [Daskalakis et al., STOC 2025]
Lecture notes · Slides

Lecture 8 (M 9/22): Faster no-regret learning dynamics and last-iterate convergence
Near-optimal regret using optimism. Connections to last-iterate convergence.
Readings: [Anagnostides et al., ICML 2022], [Anagnostides et al., NeurIPS 2022]
Lecture notes · Slides
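
As a pointer to the optimism idea referenced in this lecture, the optimistic (predictive) FTRL update is often written as follows, with regularizer R, step size eta, and a prediction of the next utility vector (commonly the most recent one); this is a standard form shown for illustration, not copied from the course notes.

```latex
% Optimistic FTRL: act on the cumulative utilities observed so far plus a
% prediction m^{t+1} of the not-yet-seen utility (commonly m^{t+1} = u^t).
\[
  x^{t+1} \;=\; \operatorname*{arg\,max}_{x \in \mathcal{X}}\;
  \eta \Big\langle x,\; m^{t+1} + \sum_{s=1}^{t} u^{s} \Big\rangle \;-\; R(x).
\]
```

When every player in a game runs such an optimistic update, individual regrets can be bounded polylogarithmically in T, which underlies the near-optimal rates and last-iterate results in the listed readings.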

Lecture 9 (W 9/24): Extensive-form games 2: CFR speedups
Alternation. Reweighted updates of regrets and strategies, LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies. Optimistic regret minimization algorithms.
Readings: [Brown & Sandholm, ICML 2017], [Brown & Sandholm, AAAI 2019], [Farina, Kroer, and Sandholm, AAAI 2021]
Slides

Lecture 10 (M 9/29): Game abstraction 1
Practical state of the art. Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction.
Readings: [Brown et al., AAMAS 2015]
Slides

Lecture 11 (W 10/1): Game abstraction 2
Abstraction algorithm for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling.
Readings: [Kroer & Sandholm, NeurIPS 2018]
Slides

Lecture 12 (M 10/6): State of the art for two-player no-limit Texas hold'em: Libratus (I)
History of game theory and AI for poker, rules of Texas hold'em, man-machine match setup.
Readings: [Brown & Sandholm, Science 2018]
Slides

Lecture 13 (W 10/8): State of the art for two-player no-limit Texas hold'em: Libratus (II)
Subgame solving in imperfect-information games.
Readings: [Brown & Sandholm, Science 2018]
Slides

Lecture 14 (M 10/20): State of the art for two-player no-limit Texas hold'em: Libratus (III)
Solving multiple subgames, self-improver, transforming poker knowledge.
Readings: [Brown & Sandholm, Science 2018]
Slides

Lecture 15 (W 10/22): Knowledge-limited subgame solving (KLSS)
Limits of subgame solving with common knowledge. Subgame solving without common knowledge. Superhuman Fog-of-War chess. Safe KLSS.
Readings: [Zhang and Sandholm, NeurIPS 2021], [Liu et al., ICML 2023]
Slides

Lecture 16 (M 10/27): State of the art for multi-player no-limit Texas hold'em: Pluribus
Depth-limited subgame solving.
Readings: [Brown & Sandholm, Science 2019]

Lecture 17 (W 10/29): Deep learning in game solving
Monte Carlo CFR (MCCFR) and sampling approaches. Deep CFR as an alternative to abstraction. DREAM, ESCHER.
Readings: [Lanctot et al., NeurIPS 2009], [Brown et al., ICML 2019], [McAleer et al., ICLR 2023]

Lecture 18 (M 11/3): Team games. Guest lecture by Brian Zhang.
Team maxmin equilibrium and TMECor; why the latter is often significantly better. Realization polytope: low dimensional but hard to represent; ways around that in practice. Team DAG and Team PSRO.
Readings: [Farina et al., NeurIPS 2018], [Zhang, Farina, and Sandholm, ICML 2023], [McAleer et al., NeurIPS 2023], [Zhang et al., EC 2024]

Lecture 19 (W 11/5): Automated (general!) mechanism design. Guest lecture by Brian Zhang.
Stackelberg equilibria, correlated equilibria, mechanism design, and information design. Optimal (revenue-maximizing) auctions.
Readings: [Zhang et al., NeurIPS 2023], [Kamenica and Gentzkow, AER]

Lecture 20 (M 11/10): Deep learning in game solving 3. Guest lecture by Stephen McAleer.
Game-theoretic LLM training. DeepNash, the state of the art for expert-level Stratego. Magnetic Mirror Descent.
Readings: [Perolat et al., Science 2022], [Perolat et al., ICML 2021], [Sokota et al., ICLR 2023]

Lecture 21 (W 11/12): Double oracle-based methods
Double oracle (DO), policy space response oracles (PSRO), extensive-form double oracle (XDO), anytime PSRO, self-play PSRO, diversity in PSRO. AlphaStar and OpenAI Five.
Readings: [Heinrich and Silver, arXiv 2016], [Vinyals et al., Nature 2019], [OpenAI et al., 2019], [Lanctot et al., NeurIPS 2017], [McAleer et al., NeurIPS 2021], [McAleer et al., ICLR 2024]
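
To illustrate the core loop shared by DO and PSRO as listed above, here is a minimal sketch of the double oracle algorithm on a zero-sum matrix game. The exact best-response oracle and the LP-based restricted-game solver are illustrative simplifications (PSRO replaces the oracle with reinforcement learning); the helper names are hypothetical and the code is not from the course.

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    """Maximin strategy of the row player for payoff matrix A (row maximizes),
    via the standard LP: maximize v subject to A^T x >= v, sum(x) = 1, x >= 0."""
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                                 # minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])    # v - (A^T x)_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0                            # sum of x equals 1 (v excluded)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[:m], res.x[-1]

def double_oracle(A, tol=1e-8):
    """Double oracle: repeatedly solve the game restricted to the current
    action sets and add each player's best response to the opponent's
    restricted equilibrium, until neither player can improve."""
    rows, cols = [0], [0]                        # arbitrary singleton starting sets
    while True:
        sub = A[np.ix_(rows, cols)]
        x_sub, _ = solve_zero_sum(sub)
        y_sub, _ = solve_zero_sum(-sub.T)        # column player's maximin strategy
        x = np.zeros(A.shape[0]); x[rows] = x_sub   # lift to the full game
        y = np.zeros(A.shape[1]); y[cols] = y_sub
        br_row = int(np.argmax(A @ y))           # row best response to y
        br_col = int(np.argmin(x @ A))           # column best response to x
        improved = False
        if (A @ y)[br_row] > x @ A @ y + tol and br_row not in rows:
            rows.append(br_row); improved = True
        if (x @ A)[br_col] < x @ A @ y - tol and br_col not in cols:
            cols.append(br_col); improved = True
        if not improved:
            return x, y                          # equilibrium of the full game
```

For example, calling double_oracle on the rock-paper-scissors matrix from the Lecture 4 sketch grows the action sets from a single action to all three and returns the uniform equilibrium.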

Lecture 22 (M 11/17): Final exam

Lecture 23 (W 11/19): Course project presentations

Lecture 24 (M 11/24): Course project presentations

Lecture 25 (M 12/1): Course project presentations

Lecture 26 (W 12/3): Course project presentations

Related courses

Professor | Title | Year | University
Tuomas Sandholm and Brian Hu Zhang | Computational Game Solving | 2024 | CMU
Tuomas Sandholm and Stephen McAleer | Computational Game Solving | 2023 | CMU
Tuomas Sandholm and Gabriele Farina | Computational Game Solving | 2021 | CMU
Vince Conitzer | Foundations of Cooperative AI | 2024 | CMU
Vince Conitzer, Caspar Oesterheld, Tuomas Sandholm | Foundations of Cooperative AI | 2022 | CMU
Tuomas Sandholm | Foundations of Electronic Marketplaces | 2015 | CMU
Gabriele Farina and Constantinos Daskalakis | Topics in Multiagent Learning | 2023 | MIT
Christian Kroer | Economics, AI, and Optimization | 2020 | Columbia
John P. Dickerson | Applied Mechanism Design for Social Good | 2018 | UMD
Fei Fang | Artificial Intelligence Methods for Social Good | 2018 | CMU
Yiling Chen | Topics at the Interface between Computer Science and Economics | 2016 | Harvard
Vincent Conitzer | Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design | 2016 | Duke
Ariel Procaccia | Truth, Justice, and Algorithms | 2016 | CMU
Tim Roughgarden | Algorithmic Game Theory | 2013 | Stanford