Fall 2021
The course will focus on multistep imperfect-information games because most real-world strategic settings are such games. Such games raise additional issues beyond perfect-information games like chess and Go, such as signaling, deception, and understanding deception by others. There has been tremendous progress in the AI community on solving such games since around 2003. This course covers the fundamentals and the state of the art of solving such games.
Instructors:
· Tuomas Sandholm (Office: GHC 9205, Email: sandholm AT cs)
· Gabriele Farina (Office: GHC 9221, Email: gfarina AT cs)
· Lecture time: Tuesdays & Thursdays, 11:50 AM – 1:10 PM. No class on 10/14 or 11/25 due to University holidays.
· Lecture location: GHC 4211
· Office hours: Flexible (email us to schedule)
The course will be lecture-based. At the end of the course, a few lectures will be devoted to project presentations by students.
Readings will consist of a mixture of papers and course notes.
Students will complete a final project. It may be done individually or in groups of 2–3 students. It can be theoretical, experimental, or a combination of the two. The students can pick their course project topic subject to instructor approval of the project proposal. The project can be related to the student's normal research, but it cannot be the student's normal research itself. We encourage creativity in the course projects.
Grading will be as follows:
· 50%: final project
· 40%: homework sets (there will be 2–3 homework sets that may include both paper-and-pen questions and programming assignments)
· 10%: completion of readings, attendance, and participation in class discussions
Please turn in your project proposal via email by Thursday, Oct 7.
Please turn in your project write-up via email by Sunday, Dec 5.
Homework:
· Homework 1: released Sept. 24, due Oct. 7 (beginning of class)
· Homework 2: released Oct. 9, due Oct. 19 (beginning of class)
Lecture schedule:
Lecture 1 (9/7): Introduction. Course organization. Introduction to game theory. Game representations: normal form, extensive form. Solution concepts. Properties of two-player zero-sum games.

Lecture 2 (9/9): Representation of strategies in tree-form decision spaces. The obvious (but often inappropriate for optimization) behavioral representation of a strategy. Sequence-form representation. Examples of sequence-form strategies, and computation of expected utilities given the sequence-form representation (multilinearity of expected utilities). Kuhn's theorem: the relationship between normal-form and sequence-form strategies. Bottom-up decomposition of the sequence-form polytope.
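As a minimal illustration of the sequence-form idea (the tiny one-player decision tree and all probabilities below are hypothetical, not from the course materials): the realization probability of a sequence of actions is the product of the behavioral probabilities along its path, and the realization probabilities of a decision point's children must sum to that of its parent sequence.

```python
# Hypothetical behavioral strategy: action probabilities at each decision point.
behavioral = {
    "root": {"A": 0.6, "B": 0.4},      # first decision point
    "after_A": {"C": 0.5, "D": 0.5},   # reached only after playing A
}

# Sequence-form realization probabilities: product of behavioral
# probabilities along the path to each sequence.
seq = {}
seq["A"] = behavioral["root"]["A"]
seq["B"] = behavioral["root"]["B"]
seq["AC"] = seq["A"] * behavioral["after_A"]["C"]
seq["AD"] = seq["A"] * behavioral["after_A"]["D"]

# Sequence-form constraints: children sum to their parent sequence.
assert abs(seq["A"] + seq["B"] - 1.0) < 1e-12
assert abs(seq["AC"] + seq["AD"] - seq["A"]) < 1e-12
print(seq["AC"])  # 0.3
```

Note that the constraints are linear in the sequence-form variables, which is what makes expected utility multilinear in the players' sequence-form strategies.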

Lecture 3 (9/14): Regret minimization and hindsight rationality. Phi-regret minimization. Special cases: external regret minimization, internal regret minimization, swap regret. External-regret dynamics lead to Nash equilibrium in two-player zero-sum games, and to NFCCE in general multiplayer games. Internal regret minimization leads to Nash equilibrium in two-player zero-sum games, and to NFCE in general multiplayer games. Other choices of Phi transformations lead to EFCE and EFCCE. The special role of external regret minimization. Solution of convex-concave saddle-point problems via regret minimization; applications to bilinear saddle-point problems such as Nash equilibrium, optimal correlation, etc.
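As a small sketch of what external regret measures (the reward numbers and action sequence below are made up): it is the gap between what the best fixed action would have earned in hindsight and what the played actions actually earned.

```python
# External regret on hypothetical reward data.
# rewards[t][a] = reward of action a at time step t (made-up numbers).
rewards = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 0.5],
]
played = [0, 0, 1]  # index of the action actually chosen at each time step

T = len(rewards)
earned = sum(rewards[t][played[t]] for t in range(T))
best_fixed = max(sum(rewards[t][a] for t in range(T)) for a in range(3))
external_regret = best_fixed - earned
print(external_regret)  # 1.0: best fixed action earns 2.0, we earned 1.0
```

A no-external-regret algorithm guarantees that this gap grows sublinearly in T, so the time-averaged regret vanishes.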

Lecture 4 (9/16): Blackwell approachability and external regret minimization for simplex domains. Blackwell approachability games and the construction of regret matching (RM) and RM+.
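A minimal self-play sketch of regret matching, using rock-paper-scissors as the example game (the asymmetric initial regrets are an arbitrary choice to make the deterministic dynamics non-trivial): each player plays proportionally to positive cumulative regrets, and the average strategies approach the uniform Nash equilibrium.

```python
def regret_match(regrets):
    # Regret matching: play proportionally to positive cumulative regrets;
    # fall back to uniform if no action has positive regret.
    pos = [max(r, 0.0) for r in regrets]
    s = sum(pos)
    n = len(regrets)
    return [p / s for p in pos] if s > 0 else [1.0 / n] * n

# Rock-paper-scissors payoff matrix for player 1 (rows R, P, S).
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

T = 20000
regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # arbitrary asymmetric start
avg = [0.0, 0.0, 0.0]
for _ in range(T):
    x = regret_match(regrets[0])
    y = regret_match(regrets[1])
    for a in range(3):
        avg[a] += x[a]
    # Expected utility of each pure action against the opponent's mix.
    u1 = [sum(A[a][b] * y[b] for b in range(3)) for a in range(3)]
    u2 = [sum(-A[a][b] * x[a] for a in range(3)) for b in range(3)]
    v1 = sum(x[a] * u1[a] for a in range(3))
    v2 = sum(y[b] * u2[b] for b in range(3))
    for a in range(3):
        regrets[0][a] += u1[a] - v1
        regrets[1][a] += u2[a] - v2

avg_strategy = [c / T for c in avg]
print(avg_strategy)  # approaches the uniform Nash equilibrium (1/3, 1/3, 1/3)
```

Because both players' average regrets vanish at rate O(1/sqrt(T)), the pair of average strategies is an approximate Nash equilibrium of the zero-sum game, with approximation error shrinking as T grows.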

Lecture 5 (9/21): Regret circuits and counterfactual regret minimization (CFR). Treeplex case: regret circuits for Cartesian products and for convex hulls. Construction of CFR and pseudocode; proof of correctness and convergence speed.

Lecture 6 (9/23): Provably correct techniques for speeding up CFR algorithms. Alternation. Reweighted updates of regrets and strategies, LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies.

Lecture 7 (9/28): Optimistic/predictive regret minimization via online optimization. Online projected gradient descent. Distance-generating functions. Predictive follow-the-regularized-leader (FTRL), predictive online mirror descent (OMD), and RVU bounds. Notable instantiations, e.g., optimistic hedge/multiplicative weights update. Accelerated convergence to bilinear saddle points. Dilatable global entropy.
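A minimal sketch of a single optimistic hedge step (the two-action strategy, utility vectors, and step size below are hypothetical): with the standard "last gradient" prediction, the multiplicative update uses the observed utility vector plus a prediction that it will repeat, i.e., the exponent 2*g_t - g_{t-1}.

```python
import math

def optimistic_hedge_step(x, g_now, g_prev, eta):
    # Optimistic hedge / multiplicative weights with the last observed
    # utility vector as the prediction of the next one: reweight each
    # action by exp(eta * (2*g_now - g_prev)), then renormalize.
    w = [xi * math.exp(eta * (2.0 * gn - gp))
         for xi, gn, gp in zip(x, g_now, g_prev)]
    z = sum(w)
    return [wi / z for wi in w]

x = [0.5, 0.5]        # current strategy over two actions (hypothetical)
g_prev = [0.0, 0.0]   # previous round's utility vector (hypothetical)
g_now = [1.0, 0.0]    # current round's utility vector (hypothetical)
x_next = optimistic_hedge_step(x, g_now, g_prev, eta=0.5)
print(x_next)  # mass shifts toward the first (higher-utility) action
```

When the utility vectors change slowly (as in self-play between two predictive learners), the prediction is accurate and the RVU bounds yield accelerated convergence to the bilinear saddle point.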

Lecture 8 (9/30): Predictive Blackwell approachability. Blackwell approachability on conic domains. Using regret minimization to solve a Blackwell approachability game. Abernethy et al.'s construction. Predictive Blackwell approachability.

Lecture 9 (10/5): Predictive regret matching and predictive regret matching plus. Connections between follow-the-regularized-leader / online mirror descent and regret matching / regret matching plus. Construction of predictive regret matching and predictive regret matching plus.

Lecture 10 (10/7): Monte Carlo CFR and offline optimization methods for two-player zero-sum games. Regret minimization with unbiased estimators of the utilities. Game-theoretic utility estimators (external sampling, outcome sampling). Offline optimization methods for two-player zero-sum games. Accelerated first-order saddle-point solvers (excessive gap technique, mirror prox). Linear programming formulation of Nash equilibrium strategies. Payoff matrix sparsification.
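A small sketch of the importance-sampling idea behind unbiased utility estimators in Monte Carlo CFR (all numbers and names below are hypothetical): sample a single action, and divide its observed utility by the probability with which it was sampled; the estimate is zero elsewhere, yet correct in expectation for every action.

```python
import random

utils = [1.0, -2.0, 4.0]    # true utility of each action (made up)
sampling = [0.2, 0.3, 0.5]  # sampling distribution; must put positive
                            # probability on every action for unbiasedness

def sampled_estimate(rng):
    # Sample one action and importance-correct its utility by 1/probability.
    a = rng.choices(range(3), weights=sampling)[0]
    est = [0.0, 0.0, 0.0]
    est[a] = utils[a] / sampling[a]
    return est

rng = random.Random(0)
n = 200000
avg = [0.0, 0.0, 0.0]
for _ in range(n):
    e = sampled_estimate(rng)
    for a in range(3):
        avg[a] += e[a] / n
print(avg)  # concentrates around the true utilities [1.0, -2.0, 4.0]
```

Outcome sampling applies the same correction along a single sampled path through the game tree, trading higher variance for much cheaper iterations.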


Lecture 11 (10/12): Game abstraction 1: practical state of the art. Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction.

Lecture 12 (10/19): Game abstraction 2. Abstraction algorithms for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling.

Lecture 13 (10/21): State of the art for two-player no-limit Texas hold’em: Libratus. Subgame solving in imperfect-information games. Self-improver. Time allowing: Deep CFR as an alternative to abstraction, and Supremus improvements to DeepStack.

Lecture 14 (10/26): State of the art for multiplayer no-limit Texas hold’em: Pluribus. Depth-limited subgame solving.

Lecture 15 (10/28): “Endgame” solving without a blueprint strategy: ReBeL. Guest lecture by Noam Brown.

Lecture 16 (11/2): Simulator-access games 1: state of the art for StarCraft: AlphaStar. Guest lecture by Wojtek Czarnecki (DeepMind), who worked on the multiagent aspects of AlphaStar.

Lecture 17 (11/4): Simulator-access games 2: (near-)optimality certificates and how to learn them. Guest lecture by Brian Zhang.

Lecture 18 (11/9): Opponent exploitation. Different approaches to opponent exploitation. Hybrids with game theory. Safe opponent exploitation.

Lecture 19 (11/11): Equilibrium refinements. Sequential irrationality. Trembling-hand equilibrium refinements: quasi-perfect equilibrium (QPE) and extensive-form perfect equilibrium (EFPE). Relationships among refinements. Computational complexity. Trembling-hand linear programming formulations of QPE and EFPE. Scalable exact algorithms for QPE and EFPE.

Lecture 20 (11/16): Correlated strategies and team coordination. Team max-min equilibrium and TMECor; why the latter is often significantly better. The realization polytope: low-dimensional, but only its vertices are known rather than its constraints; practical ways around that.

Lecture 21 (11/18): Course project presentations by David, Keegan, and Kevin.

Lecture 22 (11/23): Course project presentations by Neil, Wan, and Alex.

Lecture 23 (11/30): Course project presentations by Brian, Ioannis, and Carmel.

Lecture 24 (12/2): Course project presentations by Justin, Anita, and Gokul.
Related courses:
· Christian Kroer (Columbia, 2020)
· Tuomas Sandholm (CMU, 2015)
· John P. Dickerson (UMD, 2018)
· Fei Fang (CMU, 2018)
· Constantinos Daskalakis (MIT, 2015)
· Yiling Chen, Topics at the Interface between Computer Science and Economics (Harvard, 2016)
· Vincent Conitzer, Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design (Duke, 2016)
· Sanmay Das (Wash U, 2016)
· Ariel Procaccia (CMU, 2016)
· Milind Tambe (USC, 2016)
· Tim Roughgarden (Stanford, 2013)