Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games

Ioannis Anagnostides, Gabriele Farina, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Tuomas Sandholm

Abstract

In this paper we establish efficient and uncoupled learning dynamics so that, when employed by all players in a general-sum multiplayer game, the swap regret of each player after T repetitions of the game is bounded by $O(\log T)$, improving over the prior best bounds of $O(\log^4 T)$. At the same time, we guarantee optimal $O(\sqrt T)$ swap regret in the adversarial regime as well. To obtain these results, our primary contribution is to show that when all players follow our dynamics with a time-invariant learning rate, the second-order path lengths of the dynamics up to time $T$ are bounded by $O(\log T)$, a fundamental property which could have further implications beyond near-optimally bounding the (swap) regret. Our proposed learning dynamics combine in a novel way optimistic regularized learning with the use of self-concordant barriers. Further, our analysis is remarkably simple, bypassing the cumbersome framework of higher-order smoothness recently developed by Daskalakis, Fishelson, and Golowich (NeurIPS'21).

Bibtex entry

@inproceedings{Anagnostides22:Uncoupled, title={Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games}, author={Ioannis Anagnostides and Gabriele Farina and Christian Kroer and Chung-Wei Lee and Haipeng Luo and Tuomas Sandholm}, booktitle={Neural Information Processing Systems (NeurIPS)}, year={2022} }

Download

Paper PDF

Typo or question?

Get in touch!
gfarina AT cs.cmu.edu

Metadata

Venue: NeurIPS 2022
Topic: Decision Making, Optimization, and Computational Game Theory