Stochastic Regret Minimization in Extensive-Form Games

Gabriele Farina, Christian Kroer, Tuomas Sandholm


Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals. It works by using gradient estimates that can be computed via sampling. However, stochastic methods for sequential games have not been investigated extensively beyond MCCFR. In this paper we develop a new framework for developing stochastic regret minimization methods. This framework allows us to use any regret-minimization algorithm, coupled with any gradient estimator. The MCCFR algorithm can be analyzed as a special case of our framework, and this analysis leads to significantly stronger theoretical guarantees on convergence, while simultaneously yielding a simplified proof. Our framework allows us to instantiate several new stochastic methods for solving sequential games. We show extensive experiments on five games, where some variants of our methods outperform MCCFR.

Bibtex entry

@inproceedings{Farina20:Stochastic, title={Stochastic Regret Minimization in Extensive-Form Games}, author={Farina, Gabriele and Kroer, Christian and Sandholm, Tuomas}, booktitle={International Conference on Machine Learning}, year={2020} }


Paper PDF

Typo or question?

Get in touch!
gfarina AT


Venue: ICML 2020
Topic: Decision Making, Optimization, and Computational Game Theory