Fall 2021
The course will focus on multistep imperfect-information games because most real-world strategic settings are such games. Such games raise additional issues beyond perfect-information games like chess and Go, such as signaling, deception, and understanding deception by others. There has been tremendous progress in the AI community on solving such games since around 2003. This course covers the fundamentals and the state of the art of solving such games.
Instructors:
· Tuomas Sandholm (Office: GHC 9205, Email: sandholm AT cs)
· Gabriele Farina (Office: GHC 9221, Email: gfarina AT cs)
· Lecture time: Tuesdays & Thursdays, 11:50 AM – 1:10 PM. No class on 10/14 or 11/25 due to University holidays.
· Lecture location: GHC 4211
· Office hours: Flexible (email us to schedule)
The course will be lecture-based. At the end of the course, a few lectures will be devoted to project presentations by students.
Readings will consist of a mixture of papers and course notes.
Students will complete a final project. It may be done individually or in groups of 2–3 students. It can be theoretical, experimental, or a combination of the two. The students can pick their course project topic subject to instructor approval of the project proposal. The project can be related to the student's normal research, but it cannot be the student's normal research itself. We encourage creativity in the course projects.
Grading will be as follows:
· 50%: final project
· 40%: homework sets (there will be 2–3 homework sets that may include both paper-and-pen questions and programming assignments)
· 10%: completion of readings, attendance, and participation in class discussions
Please turn in your project proposal via email by Thursday, Oct 7.
Please turn in your project write-up via email by Sunday, Dec 5.
Homework:
· Homework 1: released Sept. 24, due Oct. 7 (beginning of class)
· Homework 2: released Oct. 9, due Oct. 19 (beginning of class)
Lecture schedule:
Lecture 1 (9/7): Introduction. Course organization. Introduction to game theory. Game representations: normal form, extensive form. Solution concepts. Properties of two-player zero-sum games.

Lecture 2 (9/9): Representation of strategies in tree-form decision spaces. The obvious (but often inappropriate for optimization) behavioral representation of a strategy. Sequence-form representation. Examples of sequence-form strategies, and computation of expected utilities given the sequence-form representation (multilinearity of expected utilities). Kuhn's theorem: the relationship between normal-form and sequence-form strategies. Bottom-up decomposition of the sequence-form polytope.
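As a minimal illustration of the sequence-form idea (the tiny one-player decision tree and all probabilities below are hypothetical, not from the course materials): the realization probability of a sequence of actions is the product of the behavioral probabilities along its path, and the realization probabilities of a decision point's children must sum to that of its parent sequence.

```python
# Hypothetical behavioral strategy: action probabilities at each decision point.
behavioral = {
    "root": {"A": 0.6, "B": 0.4},      # first decision point
    "after_A": {"C": 0.5, "D": 0.5},   # reached only after playing A
}

# Sequence-form realization probabilities: product of behavioral
# probabilities along the path to each sequence.
seq = {}
seq["A"] = behavioral["root"]["A"]
seq["B"] = behavioral["root"]["B"]
seq["AC"] = seq["A"] * behavioral["after_A"]["C"]
seq["AD"] = seq["A"] * behavioral["after_A"]["D"]

# Sequence-form constraints: children sum to their parent sequence.
assert abs(seq["A"] + seq["B"] - 1.0) < 1e-12
assert abs(seq["AC"] + seq["AD"] - seq["A"]) < 1e-12
print(seq["AC"])  # 0.3
```

Note that the constraints are linear in the sequence-form variables, which is what makes expected utility multilinear in the players' sequence-form strategies.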

Lecture 3 (9/14): Regret minimization and hindsight rationality. Phi-regret minimization. Special cases: external regret minimization, internal regret minimization, swap regret. External-regret dynamics lead to Nash equilibrium in two-player zero-sum games, and to NFCCE in general multiplayer games. Internal regret minimization leads to Nash equilibrium in two-player zero-sum games, and to NFCE in general multiplayer games. Other choices of Phi transformations lead to EFCE and EFCCE. The special role of external regret minimization. Solution of convex-concave saddle-point problems via regret minimization; applications to bilinear saddle-point problems such as Nash equilibrium, optimal correlation, etc.
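As a small sketch of what external regret measures (the reward numbers and action sequence below are made up): it is the gap between what the best fixed action would have earned in hindsight and what the played actions actually earned.

```python
# External regret on hypothetical reward data.
# rewards[t][a] = reward of action a at time step t (made-up numbers).
rewards = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 0.5],
]
played = [0, 0, 1]  # index of the action actually chosen at each time step

T = len(rewards)
earned = sum(rewards[t][played[t]] for t in range(T))
best_fixed = max(sum(rewards[t][a] for t in range(T)) for a in range(3))
external_regret = best_fixed - earned
print(external_regret)  # 1.0: best fixed action earns 2.0, we earned 1.0
```

A no-external-regret algorithm guarantees that this gap grows sublinearly in T, so the time-averaged regret vanishes.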

Lecture 4 (9/16): Blackwell approachability and external regret minimization for simplex domains. Blackwell approachability games and the construction of regret matching (RM) and RM+.
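A minimal self-play sketch of regret matching, using rock-paper-scissors as the example game (the asymmetric initial regrets are an arbitrary choice to make the deterministic dynamics non-trivial): each player plays proportionally to positive cumulative regrets, and the average strategies approach the uniform Nash equilibrium.

```python
def regret_match(regrets):
    # Regret matching: play proportionally to positive cumulative regrets;
    # fall back to uniform if no action has positive regret.
    pos = [max(r, 0.0) for r in regrets]
    s = sum(pos)
    n = len(regrets)
    return [p / s for p in pos] if s > 0 else [1.0 / n] * n

# Rock-paper-scissors payoff matrix for player 1 (rows R, P, S).
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

T = 20000
regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # arbitrary asymmetric start
avg = [0.0, 0.0, 0.0]
for _ in range(T):
    x = regret_match(regrets[0])
    y = regret_match(regrets[1])
    for a in range(3):
        avg[a] += x[a]
    # Expected utility of each pure action against the opponent's mix.
    u1 = [sum(A[a][b] * y[b] for b in range(3)) for a in range(3)]
    u2 = [sum(-A[a][b] * x[a] for a in range(3)) for b in range(3)]
    v1 = sum(x[a] * u1[a] for a in range(3))
    v2 = sum(y[b] * u2[b] for b in range(3))
    for a in range(3):
        regrets[0][a] += u1[a] - v1
        regrets[1][a] += u2[a] - v2

avg_strategy = [c / T for c in avg]
print(avg_strategy)  # approaches the uniform Nash equilibrium (1/3, 1/3, 1/3)
```

Because both players' average regrets vanish at rate O(1/sqrt(T)), the pair of average strategies is an approximate Nash equilibrium of the zero-sum game, with approximation error shrinking as T grows.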

Lecture 5 (9/21): Regret circuits and counterfactual regret minimization (CFR). Treeplex case: regret circuits for Cartesian products and for convex hulls. Construction of CFR and pseudocode; proof of correctness and convergence speed.

Lecture 6 (9/23): Provably correct techniques for speeding up CFR algorithms. Alternation. Reweighted updates of regrets and strategies, LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies.

Lecture 7 (9/28): Optimistic/predictive regret minimization via online optimization. Online projected gradient descent. Distance-generating functions. Predictive follow-the-regularized-leader (FTRL), predictive online mirror descent (OMD), and RVU bounds. Notable instantiations, e.g., optimistic hedge/multiplicative weights update. Accelerated convergence to bilinear saddle points. Dilatable global entropy.
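A minimal sketch of a single optimistic hedge step (the two-action strategy, utility vectors, and step size below are hypothetical): with the standard "last gradient" prediction, the multiplicative update uses the observed utility vector plus a prediction that it will repeat, i.e., the exponent 2*g_t - g_{t-1}.

```python
import math

def optimistic_hedge_step(x, g_now, g_prev, eta):
    # Optimistic hedge / multiplicative weights with the last observed
    # utility vector as the prediction of the next one: reweight each
    # action by exp(eta * (2*g_now - g_prev)), then renormalize.
    w = [xi * math.exp(eta * (2.0 * gn - gp))
         for xi, gn, gp in zip(x, g_now, g_prev)]
    z = sum(w)
    return [wi / z for wi in w]

x = [0.5, 0.5]        # current strategy over two actions (hypothetical)
g_prev = [0.0, 0.0]   # previous round's utility vector (hypothetical)
g_now = [1.0, 0.0]    # current round's utility vector (hypothetical)
x_next = optimistic_hedge_step(x, g_now, g_prev, eta=0.5)
print(x_next)  # mass shifts toward the first (higher-utility) action
```

When the utility vectors change slowly (as in self-play between two predictive learners), the prediction is accurate and the RVU bounds yield accelerated convergence to the bilinear saddle point.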

Lecture 8 (9/30): Predictive Blackwell approachability. Blackwell approachability on conic domains. Using regret minimization to solve a Blackwell approachability game. Abernethy et al.'s construction. Predictive Blackwell approachability.

Lecture 9 (10/5): Predictive regret matching and predictive regret matching plus. Connections between follow-the-regularized-leader / online mirror descent and regret matching / regret matching plus. Construction of predictive regret matching and predictive regret matching plus.

Lecture 10 (10/7): Monte Carlo CFR and offline optimization methods for two-player zero-sum games. Regret minimization with unbiased estimators of the utilities. Game-theoretic utility estimators (external sampling, outcome sampling). Offline optimization methods for two-player zero-sum games. Accelerated first-order saddle-point solvers (excessive gap technique, mirror prox). Linear programming formulation of Nash equilibrium strategies. Payoff matrix sparsification.
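A small sketch of the importance-sampling idea behind unbiased utility estimators in Monte Carlo CFR (all numbers and names below are hypothetical): sample a single action, and divide its observed utility by the probability with which it was sampled; the estimate is zero elsewhere, yet correct in expectation for every action.

```python
import random

utils = [1.0, -2.0, 4.0]    # true utility of each action (made up)
sampling = [0.2, 0.3, 0.5]  # sampling distribution; must put positive
                            # probability on every action for unbiasedness

def sampled_estimate(rng):
    # Sample one action and importance-correct its utility by 1/probability.
    a = rng.choices(range(3), weights=sampling)[0]
    est = [0.0, 0.0, 0.0]
    est[a] = utils[a] / sampling[a]
    return est

rng = random.Random(0)
n = 200000
avg = [0.0, 0.0, 0.0]
for _ in range(n):
    e = sampled_estimate(rng)
    for a in range(3):
        avg[a] += e[a] / n
print(avg)  # concentrates around the true utilities [1.0, -2.0, 4.0]
```

Outcome sampling applies the same correction along a single sampled path through the game tree, trading higher variance for much cheaper iterations.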


Lecture 11 (10/12): Game abstraction 1: practical state of the art. Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction.

Lecture 12 (10/19): Game abstraction 2. Abstraction algorithms for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling.

Lecture 13 (10/21): State of the art for two-player no-limit Texas hold’em: Libratus. Subgame solving in imperfect-information games. Self-improver. Time allowing: Deep CFR as an alternative to abstraction, and Supremus improvements to DeepStack.

Lecture 14 (10/26): State of the art for multiplayer no-limit Texas hold’em: Pluribus. Depth-limited subgame solving.

Lecture 15 (10/28): “Endgame” solving without a blueprint strategy: ReBeL. Guest lecture by Noam Brown.

Lecture 16 (11/2): Simulator-access games 1: state of the art for StarCraft: AlphaStar. Guest lecture by Wojtek Czarnecki (DeepMind), who worked on the multiagent aspects of AlphaStar.

Lecture 17 (11/4): Simulator-access games 2: (near-)optimality certificates and how to learn them. Guest lecture by Brian Zhang.

Lecture 18 (11/9): Opponent exploitation. Different approaches to opponent exploitation. Hybrids with game theory. Safe opponent exploitation.

Lecture 19 (11/11): Equilibrium refinements. Sequential irrationality. Trembling-hand equilibrium refinements: quasi-perfect equilibrium (QPE) and extensive-form perfect equilibrium (EFPE). Relationships among refinements. Computational complexity. Trembling-hand linear programming formulations of QPE and EFPE. Scalable exact algorithms for QPE and EFPE.

Lecture 20 (11/16): Correlated strategies and team coordination. Team max-min equilibrium and TMECor; why the latter is often significantly better. The realization polytope: low-dimensional, but only its vertices are known rather than its constraints; practical ways around that.

Lecture 21 (11/18): Course project presentations by David, Keegan, and Kevin.

Lecture 22 (11/23): Course project presentations by Neil, Wan, and Alex.

Lecture 23 (11/30): Course project presentations by Brian, Ioannis, and Carmel.

Lecture 24 (12/2): Course project presentations by Justin, Anita, and Gokul.
Related courses:
· Christian Kroer (Columbia, 2020)
· Tuomas Sandholm (CMU, 2015)
· John P. Dickerson (UMD, 2018)
· Fei Fang (CMU, 2018)
· Constantinos Daskalakis (MIT, 2015)
· Yiling Chen, Topics at the Interface between Computer Science and Economics (Harvard, 2016)
· Vincent Conitzer, Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design (Duke, 2016)
· Sanmay Das (Wash U, 2016)
· Ariel Procaccia (CMU, 2016)
· Milind Tambe (USC, 2016)
· Tim Roughgarden (Stanford, 2013)