
15-888 Computational Game Solving (Fall 2021) Tue, September 21th 2021
Lecture 5
Regret circuits and the Counterfactual Regret Minimization (CFR) paradigm
Instructor: Gabriele Farina
In Lecture 4 we studied online optimization for simplex domains. In this section we show that it is possible
to construct an efficient regret minimizer that outputs sequence-form strategies, by combining any local
regret minimizers at each decision point of the tree-form sequential decision making, including RM and RM+
discussed before. The resulting algorithm has been known since the seminal work by Zinkevich et al. [2007]
under the name counterfactual regret minimization (CFR).
The analysis we present is based on the formalism of regret circuits [Farina et al., 2019].
1 Regret circuits
Regret circuits are a constructive answer to the question: “given regret minimizers for two sets X , Y, can we
compose these regret minimizers for various operations performed on the sets (e.g., intersection, convex hull,
Cartesian product), in order to arrive at a regret minimizer for the resulting composite set?”. In particular, in
the following subsections, we will show that once regret minimizers for two sets X and Y are known, they can
always be efficiently combined to come up with composite regret minimizers for the Cartesian product X × Y
and the convex hull co{X , Y}.
Since, as we say in Lecture 2, any sequence-form strategy space Q can be defined inductively starting from
convex hulls and Cartesian products, we can keep combining regret circuits until we construct an algorithm
for Q. The resulting family of algorithms is known as counterfactual regret minimization (CFR).
1.1 Regret circuit for Cartesian product
Let X ⊆ m and Y ⊆ n be two sets, and let RX and RY be regret minimizers for X and Y respectively. We
can combine RX and RY to form a regret minimizer for the Cartesian product X × Y ⊆ m+n by doing the
following:
To output the next strategy, we ask RX and RY for their next strategies—call them xt and yt—and
return the pair (xt, yt) ∈ X × Y.
Any utility vector ` m+n can be written as ` = (`X , `Y ) m × n. When at time t we receive the
utility vector `t = (`t
X , `t
Y ) we forward `t
X to RX and `t
Y to Y.
The algorithm we just described is summarized in Algorithm 1 and given pictorially in Figure 1.
Algorithm 1: Regret minimizer for X × Y
1 function NextStrategy()
2 xt ← RX .NextStrategy()
3 yt ← RY .NextStrategy()
4 return (xt, yt) ∈ X × Y
5 function ObserveUtility(`t = (`t
X , `t
Y ) m × n)
6 RX .ObserveUtility(`t
X )
7 RY .ObserveUtility(`t
Y )
RX
RY
xt
yt
(xt, yt)`t1
`t1
X
`t1
Y
Figure 1: Regret circuit for the Cartesian product
X × Y.
Analysis The regret minimizers RX and RY output strategies xt, yt and observe utilities `t
X , `t
Y at all time
t. Hence, the regret cumulated by RX and RY up to time T is given by
RT
X = max
ˆx∈X
{ T
t=1
(`t
X )> ˆx (`t
X )>xt
}
RT
Y = max
ˆy∈Y
{ T
t=1
(`t
Y )> ˆy (`t
Y )>yt
}
.
In order to verify that Algorithm 1 indeed yields a regret minimizer we study the regret cumulated by
Algorithm 1 up to any time T . Starting from the definition of regret, we have
RT
X ×Y = max
( ˆx, ˆy)∈X ×Y
{ T
t=1
(`t
X , `t
Y )>( ˆx, ˆy) (`t
X , `t
Y )>(xt, yt)
}
= max
ˆx∈X
{ T
t=1
(`t
X )> ˆx
T
t=1
(`t
X )>xt
}
+ min
ˆy∈Y
{ T
t=1
(`t
Y )> ˆy
T
t=1
(`t
Y )>yt
}
= RT
X + RT
Y .
So, if RX and RY guarantee sublinear regret, then so does Algorithm 1. In other words, it is possible
to minimize regret on X × Y by simply minimizing it on X and Y independently and then combining the
decisions.
In summary, we have proven the following result.
Theorem 1.1. The regret circuit of Figure 1 provides a regret minimizer for the Cartesian product X × Y
that satisfies the regret bound
RT
X ×Y = RT
X + RT
Y .
The extension to the Cartesian product of more than two sets is direct.
1.2 Regret circuit for convex hull
We now show how to combine RX and RY for two sets X ⊆ n and Y ⊆ n to construct a regret minimizer
for the convex hull co{X , Y}. The construction is harder than in the case of Cartesian products, but some of
the ideas carry over. Then, we will use a third regret minimizer R for the 2-simplex 2 decide how to “mix”
xt and yt. More precisely, we will do the following.
To output the next strategy, we ask RX and RY for their next strategies—call them xt and yt.
Then, we will ask R for its next strategy λt = (λt
1, λt
2) 2, and return the convex combination
λt
1 xt + λt
2 yt co{X , Y}.
When at time t we receive the utility vector `t n, we forward `t to RX and RY . Then, we forward
the utility vector
`t
:=
((`t)>xt
(`t)>yt
)
. (1)
to R.
The algorithm we just described is summarized in Algorithm 2 and given pictorially in Figure 2.
Algorithm 2: Regret minimzer for co{X , Y}
1 function NextStrategy()
2 xt ← RX .NextStrategy()
3 yt ← RY .NextStrategy()
4 λt = (λt
1, λt
2) 2 ← R.NextStrategy()
5 return λt
1xt + λt
2yt co{X , Y}
6 function ObserveUtility(`t)
7 RX .ObserveUtility(`t)
8 RY .ObserveUtility(`t)
9 R.ObserveUtility
(
`t
:=
((`t)>xt
(`t)>yt
))
RX
RY
R
xt
xt1
yt
yt1
λt
1xt + λt
2yt
`t1 `t1
λt
Figure 2: Regret circuit for the convex hull co{X , Y}. T
utility vector `t
λ is defined in Equation (1).
Analysis The regret minimizers RX and RY output strategies xt, yt and observe utilities `t at all time t.
Hence, the regret cumulated by RX and RY up to time T is given by
RT
X = max
ˆx∈X
{ T
t=1
(`t)> ˆx (`t)>xt
}
(2)
RT
Y = max
ˆy∈Y
{ T
t=1
(`t)> ˆy (`t)>yt
}
. (3)
The regret minimizer R outputs strategies λt = (λt
1, λt
2) at all times t and observes utility vectors `t
at all
times t. So, the regret cumulated by R is given by
RT
:= max
ˆλ2
{( T
t=1
ˆλ1(`t)>xt + ˆλ2(`t)>yt
)}

( T
t=1
λt
1(`t)>xt + λt
2(`t)>yt
)
.
In order to verify that Algorithm 2 indeed yields a regret minimizer we study the regret it cumulates up to
any time T . Starting from the definition of regret, we have
RT = max
ˆλ2
ˆx∈X , ˆy∈Y
{ T
t=1
(`t)>λ1 ˆx + ˆλ2 ˆy) (`t)>(λt
1xt + λt
2yt)
}
= max
ˆλ2
ˆx∈X , ˆy∈Y
{
ˆλ1
T
t=1
(`t)> ˆx + ˆλ2
T
t=1
(`t)> ˆy
}

( T
t=1
λt
1 (`t)>xt + λt
2 (`t)>yt
)
. (4)
Now, we make two important observations. First,
max
ˆλ2
ˆx∈X , ˆy∈Y
{
ˆλ1
T
t=1
(`t)> ˆx + ˆλ2
T
t=1
(`t)> ˆy
}
= max
ˆλ2
{
ˆλ1 max
ˆx∈X
{ T
t=1
(`t)> ˆx
}
+ ˆλ2 max
y∈Y
{ T
t=1
(`t)> ˆy
}}
, (5)
since all components of ˆλ = ( ˆλ1, ˆλ2) 2 are non-negative. Second, the inner minimization problem over X
is related to the cumulative regret RT
X (2) of the regret minimizer RX that observes the utility functions `t as
follows:
max
ˆx∈X
{ T
t=1
(`t)> ˆx
}
= RT
X +
T
t=1
(`t)>xt. (6)
Similarly, for Y we have from (3) that
max
ˆy∈Y
{ T
t=1
(`t)> ˆy
}
= RT
Y +
T
t=1
(`t)>yt. (7)
Plugging Equations (5), (6), and (7) into (4), we obtain
RT = max
ˆλ2
{
ˆλ1 max
ˆx∈X
{ T
t=1
(`t)> ˆx
}
+ ˆλ2 max
y∈Y
{ T
t=1
(`t)> ˆy
}}

( T
t=1
λt
1(`t)>xt + λt
2(`t)>yt
)
= max
ˆλ2
{
ˆλ1
(
RT
X +
T
t=1
(`t)>xt
)
+ ˆλ2
(
RT
Y +
T
t=1
(`t)>yt
)}

( T
t=1
λt
1(`t)>xt + λt
2(`t)>yt
)
= max
ˆλ2
{( T
t=1
ˆλ1(`t)>xt + ˆλ2(`t)>yt
)
+
(ˆλ1RT
X + ˆλ2RT
Y
)}

( T
t=1
λt
1(`t)>xt + λt
2(`t)>yt
)
.
No matter the choice of λ1, ˆλ2) 2, the convex combination ˆλ1RT
X + ˆλ2RT
Y satisfies
ˆλ1RT
X + ˆλ2RT
Y max{RT
X , RT
Y }.
Hence, we can write
RT max
ˆλ2
{( T
t=1
ˆλ1(`t)>xt + ˆλ2(`t)>yt
)
+ max{RT
X , RT
Y }
}

( T
t=1
λt
1(`t)>xt + λt
2(`t)>yt
)
= max
ˆλ2
{( T
t=1
ˆλ1(`t)>xt + ˆλ2(`t)>yt
)}
+ max{RT
X , RT
Y } −
( T
t=1
λt
1(`t)>xt + λt
2(`t)>yt
)
= RT
+ max{RT
X , RT
Y }. (8)
Equation (8) guarantees that if all three regrets {RT
, RT
X , RT
Y } grow sublinearly, then so does RT . Figure 2
shows the regret circuit that corresponds to our construction above.
In summary, we have proven the following result.
Theorem 1.2. The regret circuit of Figure 2 provides a regret minimizer for the convex hull co{X , Y}
that satisfies the regret bound
RT RT
2 + max{RT
X , RT
Y }.
Extension to multiple sets The construction shown in Algorithm 2 and Figure 2 can be readily extended to
handle the convex hull co{X1, . . . , Xn} of n sets as follows.
At each time instant t, we let n regret minimizers RX1 , . . . , RXn (for X1, . . . , Xn, respectively) output
their next strategies xt
1, . . . , xt
n. Those strategies are combined according to the convex combination
coefficients λt output by the additional regret minimizer R for the n-simplex n to form the convex
combination strategy λt
1xt
1 + · · · + λt
nxt
n co{X1, . . . , Xn}.
The utility vector `t is fed into all the regret minimizers RXi (i = 1, . . . , n). Then, the utility vector `t
,
defined as
`t
=



(`t)>xt
1
...
(`t)>xt
n


,
is input into a regret minimizer for the n-dimensional simplex n.
With the same argument used earlier, one can easily show that the regret bound in that case is
RT RT
n + max{RT
X1 , . . . , RT
Xn }.
2 Counterfactual Regret Minimization
We have seen in Lecture 2 that a sequence-form strategy space can be characterized recursively by composing
convex hulls and Cartesian products operations. By applying the regret circuits described above, we can
then construct a regret minimizers for any sequence-form strategy space. The resulting regret minimizer is
called CFR. In a nutshell, CFR decomposes the problem of minimizing regret on the whole tree-form decision
problem into local regret minimization problems at each of the individual decision points j ∈ J . Any regret
minimizer Rj for simplex domains can be used to solve the local regret minimization problems. Popular
options are the regret matching algorithm, and the regret matching plus algorithm (Lecture 4).
Before giving pseudocode, we recall and introduce a bit of notation to deal with tree-form sequential
decision processes.
Notation for tree-form sequential decision processes We use the following notation for dealing with
tree-form sequential decision processes (TFSDPs), most of which was already introduced in Lecture 2. The
notation is also summarized in Table 1.
k1
j1 j2 j3
k2 k3 k4
j4 j5 j6
fold call fold call fold call
check raise check raise check raise
jack queen king
check raise check raise check raise
Figure 3: Tree-form sequential decision making process of the first acting player in the game of Kuhn poker.
We denote the set of decision points in the TFSDP as J , and the set of observation points as K. At
each decision point j ∈ J , the agent selects an action from the set Aj of available actions. At each
observation point k ∈ K, the agent observes a signal sk from the environment out of a set of possible
signals Sk.
(new!) We denote by ρ the transition function of the process. Picking action a Aj at decision point
j ∈ J results in the process transitioning to ρ(j, a) ∈ J ∪ K ∪ {⊥}, where denotes the end of the
decision process. Similarly, the process transitions to ρ(k, s) ∈ J ∪ K ∪ {⊥} after the agent observes
signal s Sk at observation point k ∈ K.
A pair (j, a) where j ∈ J and a Aj is called a sequence. The set of all sequences is denoted as
Σ := {(j, a) : j ∈ J , a Aj }. For notational convenience, we will often denote an element (j, a) in Σ as
ja without using parentheses.
Given a decision point j ∈ J , we denote by pj its parent sequence, defined as the last sequence (that
is, decision point-action pair) encountered on the path from the root of the decision process to j. If
the agent does not act before j (that is, j is the root of the process or only observation points are
encountered on the path from the root to j), we let pj = .
Symbol Description
J Set of decision points
Aj Set of legal actions at decision point j ∈ J
K Set of observation points
Sk Set of possible signals at observation point k ∈ K
ρ Transition function:
given j ∈ J and a Aj , ρ(j, a) returns the next point v ∈ J ∪ K in the decision tree that
is reached after selecting legal action a in j, or if the decision process ends;
given k ∈ K and s Sk, ρ(k, s) returns the next point v ∈ J ∪ K in the decision tree that
is reached after observing signal s in k, or if the decision process ends
Σ Set of sequences, defined as Σ := {(j, a) : j ∈ J , a Aj }
pj Parent sequence of decision point j ∈ J , defined as the last sequence (decision point-action
pair) on the path from the root of the TFSDP to decision point j; if the agent does not act
before j, pj =
Table 1: Summary of notation in TFSDPs.
As an example, consider the TFSDP faced by Player 1 in the game of Kuhn poker [Kuhn, 1950], depicted in
Figure 3, which we already introduced in Lecture 2. There, we have that J = {j1, . . . , j6} and K = {k1, . . . , k4}.
Aj1 = Sk4 = {check, raise}. Aj5 = {fold, call}. Sk1 = {jack, queen, king}. ρ(k3, check) = ρ(j2, raise) = .
ρ(k1, king) = j3. ρ(j2, check) = k3. pj4 = (j1, check). pj6 = (j3, check). pj1 = pj2 = pj3 = .
Notation for the components of vectors Any vector x |Σ| has, by definition, as many components as
sequences Σ. The component corresponding to a specific sequence ja Σ is denoted as x[ja]. Similarly, given
any decision point j ∈ J , any vector x |Aj | has as many components as the number of actions at j. The
component corresponding to a specific action a Aj is denoted x[a].
CFR algorithm Pseudocode for CFR is given in Algorithm 3. Note that the implementation is parametric
on the regret minimization algorithms Rj run locally at each decision point. Any regret minimizer Rj for
simplex domains can be used to solve the local regret minimization problems. Popular options are the regret
matching algorithm, and the regret matching plus algorithm (Lecture 4).
It can be shown that the regret cumulated by the CFR algorithm satisfies the following bound.
Proposition 2.1. Let RT
j (j ∈ J ) denote the regret cumulated up to time T by each of the regret
minimizers Rj . Then, the regret RT cumulated by Algorithm 3 up to time T satisfies
RT
j∈J
max{0, RT
j }.
It is then immediate to see that if each RT
j grows sublinearly in T , then so does RT .
Algorithm 3: CFR regret minimizer
Data: Rj , one regret minimizer for |Aj |; one for each decision point j ∈ J of the TFSDP.
1 function NextStrategy()
[. Step 1: we ask each of the Rj for their next strategy local at each decision point]
2 for each decision point j ∈ J do
3 bt
j |Aj | ← Rj .NextStrategy()
[. Step 2: we construct the sequence-form representation of the strategy that plays according to the
distribution bt
j at each decision point j ∈ J ]
4 xt = 0 |Σ|
5 for each decision point j ∈ J in top-down traversal order in the TFSDP do
6 for each action a Aj do
7 if pj = then
8 xt[ja] bt
j [a]
9 else
10 xt[ja] xt[pj ] · bt
j [a]
[. You should convince yourself that the vector xt we just filled in above is a valid sequence-form
strategy, that is, it satisfies the required consistency constraints we saw in Lecture 2. In symbols,
xt Q]
11 return xt
12 function ObserveUtility(`t |Σ|)
[. Step 1: we compute the expected utility for each subtree rooted at each node v ∈ J ∪ K]
13 V t empty dictionary [. eventually, it will map keys J ∪ K ∪ {⊥} to real numbers]
14 V t[] 0
15 for each node in the tree v ∈ J ∪ K in bottom-up traversal order in the TFSDP do
16 if v ∈ J then
17 Let j = v
18 V t[j]
aAj
bt
j [a] · (`t[ja] + V t[ρ(j, a)])
19 else
20 Let k = v
21 V t[k]
sSk
V t[ρ(k, s)]
[. Step 2: at each decision point j ∈ J , we now construct a local utility vector `t
j called
counterfactual utility]
22 for each decision point j ∈ J do
23 `t
j 0 |Aj |
24 for each action a Aj do
25 `t
j [a] `t[ja] + V t[ρ(j, a)]
26 Rj .ObserveUtility(`t
j )
References
Martin Zinkevich, Michael Bowling, Michael Johanson, and Carmelo Piccione. Regret minimization in games
with incomplete information. In Proceedings of the Annual Conference on Neural Information Processing
Systems (NIPS), 2007.
Gabriele Farina, Christian Kroer, and Tuomas Sandholm. Regret circuits: Composability of regret minimizers.
In Proceedings of the International Conference on Machine Learning (ICML), 2019.
H. W. Kuhn. A simplified two-person poker. In H. W. Kuhn and A. W. Tucker, editors, Contributions to the
Theory of Games, volume 1 of Annals of Mathematics Studies, 24, pages 97–103. Princeton University Press,
Princeton, New Jersey, 1950.

Content

PDF File

Typo or question?

Get in touch!
gfarina AT mit.edu

Metadata

Course: CMU 15-888
Term: Fall 2021
Date: 2021-09-21