Fall 2021
The course will focus on multistep imperfect-information games because most real-world strategic settings are such games. Such games raise additional issues beyond perfect-information games like chess and Go, such as signaling, deception, and understanding deception by others. There has been tremendous progress in the AI community on solving such games since around 2003. This course covers the fundamentals and the state of the art of solving such games.
Instructors:
· Tuomas Sandholm (Office: GHC 9205, Email: sandholm AT cs)
· Gabriele Farina (Office: GHC 9221, Email: gfarina AT cs)
· Lecture time: Tuesdays & Thursdays, 11:50 AM – 1:10 PM. No class on 10/14 or 11/25 due to University holidays.
· Lecture location: GHC 4211
· Office hours: Flexible (email us to schedule)
The course will be lecture-based. At the end of the course, a few lectures will be devoted to project presentations by students.
Readings will consist of a mixture of papers and course notes.
Students will complete a final project. It may be done individually or in groups of 2–3 students. It can be theoretical, experimental, or a combination of the two. The students can pick their course project topic subject to instructor approval of the project proposal. The project can be related to the student's normal research, but it cannot be the student's normal research itself. We encourage creativity in the course projects.
Grading will be as follows:
· 50%: final project
· 40%: homework sets (there will be 2–3 homework sets that may include both paper-and-pen questions and programming assignments)
· 10%: completion of readings, attendance, and participation in class discussions
Please turn in your project proposal via email by Thursday, Oct 7.
Please turn in your project write-up via email by Sunday, Dec 5.
Homework:
· Homework 1: released Sept. 24, due Oct. 7 (beginning of class)
· Homework 2: released Oct. 9, due Oct. 19 (beginning of class)
Lecture schedule:
Lecture 1 (9/7): Introduction. Course organization. Introduction to game theory. Game representations: normal form, extensive form. Solution concepts. Properties of two-player zero-sum games.

Lecture 2 (9/9): Representation of strategies in tree-form decision spaces. The obvious (but often inappropriate for optimization) behavioral representation of a strategy. Sequence-form representation. Examples of sequence-form strategies, and computation of expected utilities given the sequence-form representation (multilinearity of expected utilities). Kuhn's theorem: the relationship between normal-form and sequence-form strategies. Bottom-up decomposition of the sequence-form polytope.
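As a minimal illustration of the sequence-form idea (the tiny one-player decision tree and all probabilities below are hypothetical, not from the course materials): the realization probability of a sequence of actions is the product of the behavioral probabilities along its path, and the realization probabilities of a decision point's children must sum to that of its parent sequence.

```python
# Hypothetical behavioral strategy: action probabilities at each decision point.
behavioral = {
    "root": {"A": 0.6, "B": 0.4},      # first decision point
    "after_A": {"C": 0.5, "D": 0.5},   # reached only after playing A
}

# Sequence-form realization probabilities: product of behavioral
# probabilities along the path to each sequence.
seq = {}
seq["A"] = behavioral["root"]["A"]
seq["B"] = behavioral["root"]["B"]
seq["AC"] = seq["A"] * behavioral["after_A"]["C"]
seq["AD"] = seq["A"] * behavioral["after_A"]["D"]

# Sequence-form constraints: children sum to their parent sequence.
assert abs(seq["A"] + seq["B"] - 1.0) < 1e-12
assert abs(seq["AC"] + seq["AD"] - seq["A"]) < 1e-12
print(seq["AC"])  # 0.3
```

Note that the constraints are linear in the sequence-form variables, which is what makes expected utility multilinear in the players' sequence-form strategies.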

Lecture 3 (9/14): Regret minimization and hindsight rationality. Phi-regret minimization. Special cases: external regret minimization, internal regret minimization, swap regret. External-regret dynamics lead to Nash equilibrium in two-player zero-sum games, and to NFCCE in general multiplayer games. Internal regret minimization leads to Nash equilibrium in two-player zero-sum games, and to NFCE in general multiplayer games. Other choices of Phi transformations lead to EFCE and EFCCE. The special role of external regret minimization. Solution of convex-concave saddle-point problems via regret minimization; applications to bilinear saddle-point problems such as Nash equilibrium, optimal correlation, etc.
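As a small sketch of what external regret measures (the reward numbers and action sequence below are made up): it is the gap between what the best fixed action would have earned in hindsight and what the played actions actually earned.

```python
# External regret on hypothetical reward data.
# rewards[t][a] = reward of action a at time step t (made-up numbers).
rewards = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 0.5],
]
played = [0, 0, 1]  # index of the action actually chosen at each time step

T = len(rewards)
earned = sum(rewards[t][played[t]] for t in range(T))
best_fixed = max(sum(rewards[t][a] for t in range(T)) for a in range(3))
external_regret = best_fixed - earned
print(external_regret)  # 1.0: best fixed action earns 2.0, we earned 1.0
```

A no-external-regret algorithm guarantees that this gap grows sublinearly in T, so the time-averaged regret vanishes.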

Lecture 4 (9/16): Blackwell approachability and external regret minimization for simplex domains. Blackwell approachability games and the construction of regret matching (RM) and RM+.
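A minimal self-play sketch of regret matching, using rock-paper-scissors as the example game (the asymmetric initial regrets are an arbitrary choice to make the deterministic dynamics non-trivial): each player plays proportionally to positive cumulative regrets, and the average strategies approach the uniform Nash equilibrium.

```python
def regret_match(regrets):
    # Regret matching: play proportionally to positive cumulative regrets;
    # fall back to uniform if no action has positive regret.
    pos = [max(r, 0.0) for r in regrets]
    s = sum(pos)
    n = len(regrets)
    return [p / s for p in pos] if s > 0 else [1.0 / n] * n

# Rock-paper-scissors payoff matrix for player 1 (rows R, P, S).
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

T = 20000
regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # arbitrary asymmetric start
avg = [0.0, 0.0, 0.0]
for _ in range(T):
    x = regret_match(regrets[0])
    y = regret_match(regrets[1])
    for a in range(3):
        avg[a] += x[a]
    # Expected utility of each pure action against the opponent's mix.
    u1 = [sum(A[a][b] * y[b] for b in range(3)) for a in range(3)]
    u2 = [sum(-A[a][b] * x[a] for a in range(3)) for b in range(3)]
    v1 = sum(x[a] * u1[a] for a in range(3))
    v2 = sum(y[b] * u2[b] for b in range(3))
    for a in range(3):
        regrets[0][a] += u1[a] - v1
        regrets[1][a] += u2[a] - v2

avg_strategy = [c / T for c in avg]
print(avg_strategy)  # approaches the uniform Nash equilibrium (1/3, 1/3, 1/3)
```

Because both players' average regrets vanish at rate O(1/sqrt(T)), the pair of average strategies is an approximate Nash equilibrium of the zero-sum game, with approximation error shrinking as T grows.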

Lecture 5 (9/21): Regret circuits and counterfactual regret minimization (CFR). Treeplex case: regret circuits for Cartesian products and for convex hulls. Construction of CFR and pseudocode; proof of correctness and convergence speed.

Lecture 6 (9/23): Provably correct techniques for speeding up CFR algorithms. Alternation. Reweighted updates of regrets and strategies, LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies.

Lecture 7 (9/28): Optimistic/predictive regret minimization via online optimization. Online projected gradient descent. Distance-generating functions. Predictive follow-the-regularized-leader (FTRL), predictive online mirror descent (OMD), and RVU bounds. Notable instantiations, e.g., optimistic hedge/multiplicative weights update. Accelerated convergence to bilinear saddle points. Dilatable global entropy.
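A minimal sketch of a single optimistic hedge step (the two-action strategy, utility vectors, and step size below are hypothetical): with the standard "last gradient" prediction, the multiplicative update uses the observed utility vector plus a prediction that it will repeat, i.e., the exponent 2*g_t - g_{t-1}.

```python
import math

def optimistic_hedge_step(x, g_now, g_prev, eta):
    # Optimistic hedge / multiplicative weights with the last observed
    # utility vector as the prediction of the next one: reweight each
    # action by exp(eta * (2*g_now - g_prev)), then renormalize.
    w = [xi * math.exp(eta * (2.0 * gn - gp))
         for xi, gn, gp in zip(x, g_now, g_prev)]
    z = sum(w)
    return [wi / z for wi in w]

x = [0.5, 0.5]        # current strategy over two actions (hypothetical)
g_prev = [0.0, 0.0]   # previous round's utility vector (hypothetical)
g_now = [1.0, 0.0]    # current round's utility vector (hypothetical)
x_next = optimistic_hedge_step(x, g_now, g_prev, eta=0.5)
print(x_next)  # mass shifts toward the first (higher-utility) action
```

When the utility vectors change slowly (as in self-play between two predictive learners), the prediction is accurate and the RVU bounds yield accelerated convergence to the bilinear saddle point.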

Lecture 8 (9/30): Predictive Blackwell approachability. Blackwell approachability on conic domains. Using regret minimization to solve a Blackwell approachability game. Abernethy et al.'s construction. Predictive Blackwell approachability.

Lecture 9 (10/5): Predictive regret matching and predictive regret matching plus. Connections between follow-the-regularized-leader / online mirror descent and regret matching / regret matching plus. Construction of predictive regret matching and predictive regret matching plus.

Lecture 10 (10/7): Monte Carlo CFR and offline optimization methods for two-player zero-sum games. Regret minimization with unbiased estimators of the utilities. Game-theoretic utility estimators (external sampling, outcome sampling). Offline optimization methods for two-player zero-sum games. Accelerated first-order saddle-point solvers (excessive gap technique, mirror prox). Linear programming formulation of Nash equilibrium strategies. Payoff matrix sparsification.
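A small sketch of the importance-sampling idea behind unbiased utility estimators in Monte Carlo CFR (all numbers and names below are hypothetical): sample a single action, and divide its observed utility by the probability with which it was sampled; the estimate is zero elsewhere, yet correct in expectation for every action.

```python
import random

utils = [1.0, -2.0, 4.0]    # true utility of each action (made up)
sampling = [0.2, 0.3, 0.5]  # sampling distribution; must put positive
                            # probability on every action for unbiasedness

def sampled_estimate(rng):
    # Sample one action and importance-correct its utility by 1/probability.
    a = rng.choices(range(3), weights=sampling)[0]
    est = [0.0, 0.0, 0.0]
    est[a] = utils[a] / sampling[a]
    return est

rng = random.Random(0)
n = 200000
avg = [0.0, 0.0, 0.0]
for _ in range(n):
    e = sampled_estimate(rng)
    for a in range(3):
        avg[a] += e[a] / n
print(avg)  # concentrates around the true utilities [1.0, -2.0, 4.0]
```

Outcome sampling applies the same correction along a single sampled path through the game tree, trading higher variance for much cheaper iterations.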


Lecture 11 (10/12): Game abstraction 1: practical state of the art. Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction.

Lecture 12 (10/19): Game abstraction 2. Abstraction algorithms for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling.

Lecture 13 (10/21): State of the art for two-player no-limit Texas hold’em: Libratus. Subgame solving in imperfect-information games. Self-improver. Time allowing: Deep CFR as an alternative to abstraction, and Supremus improvements to DeepStack.

Lecture 14 (10/26): State of the art for multiplayer no-limit Texas hold’em: Pluribus. Depth-limited subgame solving.

Lecture 15 (10/28): “Endgame” solving without a blueprint strategy: ReBeL. Guest lecture by Noam Brown.

Lecture 16 (11/2): Simulator-access games 1: state of the art for StarCraft: AlphaStar. Guest lecture by Wojtek Czarnecki (DeepMind), who worked on the multiagent aspects of AlphaStar.

Lecture 17 (11/4): Simulator-access games 2: (near-)optimality certificates and how to learn them. Guest lecture by Brian Zhang.

Lecture 18 (11/9): Opponent exploitation. Different approaches to opponent exploitation. Hybrids with game theory. Safe opponent exploitation.

Lecture 19 (11/11): Equilibrium refinements. Sequential irrationality. Trembling-hand equilibrium refinements: quasi-perfect equilibrium (QPE) and extensive-form perfect equilibrium (EFPE). Relationships among refinements. Computational complexity. Trembling-hand linear programming formulations of QPE and EFPE. Scalable exact algorithms for QPE and EFPE.

Lecture 20 (11/16): Correlated strategies and team coordination. Team max-min equilibrium and TMECor; why the latter is often significantly better. The realization polytope: low-dimensional, but only its vertices are known rather than its constraints; practical ways around that.

Lecture 21 (11/18): Course project presentations by David, Keegan, and Kevin.

Lecture 22 (11/23): Course project presentations by Neil, Wan, and Alex.

Lecture 23 (11/30): Course project presentations by Brian, Ioannis, and Carmel.

Lecture 24 (12/2): Course project presentations by Justin, Anita, and Gokul.
Related courses:
· Christian Kroer (Columbia, 2020)
· Tuomas Sandholm (CMU, 2015)
· John P. Dickerson (UMD, 2018)
· Fei Fang (CMU, 2018)
· Constantinos Daskalakis (MIT, 2015)
· Yiling Chen, Topics at the Interface between Computer Science and Economics (Harvard, 2016)
· Vincent Conitzer, Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design (Duke, 2016)
· Sanmay Das (Wash U, 2016)
· Ariel Procaccia (CMU, 2016)
· Milind Tambe (USC, 2016)
· Tim Roughgarden (Stanford, 2013)