
15-888 Computational Game Solving (Fall 2021) Tue, Oct 5th 2021
Lecture 9
Predictive regret matching and regret matching+
Instructor: Gabriele Farina
1 Predictive Blackwell approachability for simplex domains
In Lecture 8 we saw that a Blackwell approachability game with a conic target set can be solved by means
of an external regret minimization algorithm whose domain is the polar of the cone, using a construction
by Abernethy et al. [2011].
In this lecture, we will specialize that algorithm in the particular Blackwell game Γ = (∆n, n, u, S := n
0)
we introduced in Lecture 4, where the bilinear Blackwell utility of the game was defined as
u(x, ) := − ⟨, x 1 n.
As we showed back then, any solution to Γ—that is, any algorithm that picks strategies xt n so that the
average Blackwell payoff is close to the target set S = n
0—is a regret minimizer for n. In particular, recall
that the external regret
RT := max
ˆxn
T
t=1
(t) ˆx
T
t=1
(t)xt
cumulated by strategies xt with respect to any sequence of utilities t satisfies the inquality.
RT
T min
ˆsn
0




ˆs 1
T
T
t=1
u(xt, t)




2
. (1)
As we already mentioned, we are interested in solving the Blackwell game Γ by means of the general
framework introduced in Lecture 8, which for the particular case of target set n
0 boils down to Algorithm 1.
Algorithm 1: (Predictive) Blackwell approachability for simplex domain
Data: RS (predictive) regret minimizer for n
0
1 function NextStrategy(vt)
[ Set vt = 0 for the non-predictive version)]
2 θt ← RS .NextStrategy(vt)
3 if θt ̸ = 0 then return xt θt/θt1 n
4 else return an arbitrary point xt n
5 function ReceivePayoff(u(xt, t))
6 RS .ObserveLoss(u(xt, t))
Algorithm 1 gives a way to solve Γ starting from any regret minimizer RS for the nonpositive orthant
n
0. In the rest of the lecture we will explore what happens when RS is set to FTRL and OMD, as well as
their predictive variants.
2 Recovering regret matching (RM) and regret matching plus (RM+)
In this section we show that when the Blackwell game Γ is solved by means of Algorithm 1 instantiated with
RS set to FTRL, the regret matching (RM) algorithm is recovered [Farina et al., 2021]. Even more surprising,
when RS is set to OMD the regret matching plus (RM+) algorithm is recovered instead. We will use these
connections for two purposes:
The fact that RM+ can be recovered from Algorithm 1 (which was proven sound for every choice of
regret minimizer RS in Lecture 8) immediately implies correctness of RM+.Even better, the connections
between FTRL, OMD and RM, RM+ will enable us to quickly give a regret bound for RM and RM+
starting from the known regret bounds for FTRL and OMD seen in Lecture 7. We do so in Section 2.3.
The connections suggest that if we started instead from the predictive versions of FTRL and OMD, we
could hope to arrive to predictive versions of RM and RM+, respectively. We will show that that is
indeed the case in Section 3.
Algorithm 2: Regret matching
1 r0 0 n, x0 1/n n
2 function NextStrategy()
3 if θt ̸ = 0 return xt θt / θt1
4 else return xt any point in n
5 function ObserveUtility(t)
6 θt+1 θt + t − ⟨t, xt1
Algorithm 3: Regret matching+
1 z0 0 n, x0 1/n n
2 function NextStrategy()
3 if θt ̸ = 0 return xt θt / θt1
4 else return xt any point in n
5 function ObserveUtility(t)
6 θt+1 [θt + t − ⟨t, xt1]+
2.1 FTRL leads to regret matching (RM)
The regret minimizer RS is used in Algorithm 1 to pick the next vector θt n
0 to force observes utilities
u(xt, t) = t − ⟨t, xt1. Consider now RS to be the FTRL algorithm with regularizer φ = 1
2 ∥ · ∥2
2 and step
size η > 0 (recalled in Algorithm 4). In that case, the vector θt (Line 2 in Algorithm 1) has the closed-form
solution
θt = arg max
ˆθn
0



( T
t=1
u(xt, t)
)
ˆθ − ∥ ˆθ2
2
2η


= η
[ T
t=1
u(xt, t)
[+
= η
[ T
t=1
t − ⟨t, xt1
[+
.
Since the forcing action θt/θt1 is invariant to positive constants, we see that the action xt picked by ??
η > 0 and is computed as
xt =
[∑T
t=1 t − ⟨t, xt1
]+




[∑T
t=1 t − ⟨t, xt1
]+


1
. (2)
provided θt ̸ = 0, and is an arbitrary vector xt n otherwise. These iterates coincide at all times t with the
iterates produced by the regret matching algorithm seen in Lecture 4 and recalled in Algorithm 2.
Algorithm 4: (Predictive) FTRL
Data: X ⊆ n convex and compact set
φ : X → 0 strongly convex regularizer
η > 0 step-size parameter
1 L0 0 n
2 function NextStrategy(mt)
[ Set mt = 0 for non-predictive version]
3 return arg max
ˆx∈X
{
(Lt1 + mt) ˆx 1
η φ( ˆx)
}
4 function ObserveUtility(t)
5 Lt Lt1 + t
Algorithm 5: (Predictive) OMD
Data: X ⊆ n convex and compact set
φ : X → 0 strongly convex regularizer
η > 0 step-size parameter
1 z0 ∈ X such that φ(z0) = 0
2 function NextStrategy(mt)
[ Set mt = 0 for non-predictive version]
3 return arg max
ˆx∈X
{
(mt) ˆx 1
η Dφ( ˆx zt1)
}
4 function ObserveUtility(t)
5 zt arg max
ˆz∈X
{
(t) ˆz 1
η Dφ( ˆz zt1)
}
2.2 OMD corresponds to regret matching plus (RM+)
Consider now RS to be the OMD algorithm with the regularizer φ = 1
2 ∥ · ∥2
2 and step size η > 0 (recalled in
Algorithm 5). In that case, the vector θt (Line 2 in Algorithm 1) has the closed-form solution
θt = arg max
ˆθn
0
{
u(xt, t) ˆθ − ∥ ˆθ θt12
2
2η
}
= [θt1 + η u(xt, t)]+. (3)
Since (3) is homogeneous in η > 0 (that is, the only effect of η is to rescale all θt by the same constant)
and the forcing action θt/θt1 is invariant to positive rescaling of θt, we see that Algorithm 1 outputs the
same iterates no matter the choice of stepsize parameter η > 0. In particular, we can assume without loss of
generality that η = 1. In that case, Equation (3) corresponds exactly to Line 6 in RM+ (Algorithm 3).
2.3 Regret Analysis
The connectsion between regret matching (RM), regret matching plus (RM+) and FTRL, OMD we uncovered
in Sections 2.1 and 2.2 can help us establish regret bounds for RM and RM+ starting from the regret bounds
for FTRL and OMD. To do so, let’s start from recalling the relationship—seen in Lecture 8—between the
regret of RS and the distance of the average Blackwell payoff to the target set, that is,
min
ˆsn
0




ˆs + 1
T
T
t=1
u(xt, t)




2
1
T max
ˆθn
0 𝔹n
2
RT
S ( ˆθ). (4)
Combining (4) with (1) , we obtain that the regret cumulated by the sequence of strategies xt produced by
Algorithm 1 with respect to any sequence of utilities t satisfies
1
T RT 1
T max
ˆθn
0𝔹n
2
RT
S ( ˆθ) = RT max
ˆθn
0𝔹n
2
RT
S ( ˆθ), (5)
where RT
S is the regret cumulated by the regret minimizer RS oracle used in Algorithm 1. As we know from
Lecture 7, both FTRL and OMD with regularizer φ = 1
2 ∥ · ∥2
2 and step size η > 0 guarantee that
RT
S ( ˆθ) ≤ ∥ ˆθ2
2
2η + η
T
t=1
u(xt, t)2
2 = max
ˆθn
0 𝔹n
2
RT
S ( ˆθ) 1
2η + η
T
t=1
u(xt, t)2
2, (6)
where we used the fact that ˆθ 𝔹n
2 on the right side of the implication. So, plugging (6) into (5), we have
RT 1
2η + η
T
t=1
u(xt, t)2
2.
Since we have shown above that the iterates produced by regret matching (Section 2.1) and regret matching
plus (Section 2.2) are independent of η > 0, we can minimize the right-hand side over η > 0, obtaining the
bound
RT



2
T
t=1


u(xt, t)


2
2
.
Finally, expanding the definition of u(xt, t) := t, xt1 t, we obtain the following statement.
Theorem 2.1. At every time T , the regret cumulated by the regret matching (Algorithm 2) and regret
matching plus algorithms (Algorithm 3) satisfy the regret bound
RT



2
T
t=1


t − ⟨t, xt1


2
2.
3 Predictive regret matching and regret matching plus
We can repeat the same analysis we did in Section 2.1 (which used FTRL) and Section 2.2 (which used OMD)
using the predictive versions of FTRL and OMD. The resulting algorithms are again independent on the
stepsize parameter, and are given in Algorithm 6 and Algorithm 7.
Algorithm 6: (Predictive) regret matching
1 r0 0 n, x0 1/n n
2 function NextStrategy(mt)
[ Set mt = 0 for non-predictive version
3 θt [rt1 + mt, xt11 mt]+
4 if θt ̸ = 0 return xt θt / θt1
5 else return xt arbitrary point in n
6 function ObserveLoss(t)
7 rt rt1 + t, xt1 t
Algorithm 7: (Predictive) regret matching+
1 z0 0 n, x0 1/n n
2 function NextStrategy(mt)
[ Set mt = 0 for non-predictive version
3 θt [zt1 + mt, xt11 mt]+
4 if θt ̸ = 0 return xt θt / θt1
5 else return xt arbitrary point in n
6 function ObserveLoss(t)
7 zt [zt1 + t, xt1 t]+
The same regret analysis of Section 2.3 holds verbatim. In particular, we have the following.
Theorem 3.1. At every time T , the regret cumulated by the predictive regret matching (Algorithm 6)
and predictive regret matching plus algorithms (Algorithm 7) satisfy the regret bound
RT



2
T
t=1


(t mt) − ⟨t mt, xt1


2
2
.
References
Jacob Abernethy, Peter L Bartlett, and Elad Hazan. Blackwell approachability and no-regret learning are
equivalent. In Proceedings of the Conference on Learning Theory (COLT), pages 27–46, 2011.
Gabriele Farina, Christian Kroer, and Tuomas Sandholm. Faster game solving via predictive Blackwell
approachability: Connecting regret matching and mirror descent. Proceedings of the AAAI Conference on
Artificial Intelligence (AAAI), 2021.

Content

PDF File

Typo or question?

Get in touch!
gfarina AT mit.edu

Metadata

Course: CMU 15-888
Term: Fall 2021
Date: 2021-10-05