Suppose *V* and *U* are two value functions. We say
that *U* *dominates* *V* and write
if for every belief state *b*.
A value function *V* is said to be *uniformly improvable*
if .
A set of vectors *dominates* another
set of vectors if the value function induced
by dominates that induced by . A set of vectors is *unformly improvable*
if the value function it induces is.

This lemma is obvious and is well known in the MDP community
(Puterman 1990). Nonetheless, it enables us to
explain the intuition behind the term ``uniformly improvable".
Suppose *V* is a uniformly improvable value function and suppose
value iteration starts with *V*. Then the sequence of
value functions generated is monotonically increasing and converges
to the optimal value function .
This implies .
That is, *TV*(*b*) is closer to than *V*(*b*) for all
belief states *b*.

The following lemma will be used later to address the issues listed at the end of the previous section.

**Proof**: Since , we have by Lemma 1.
We also have the condition .
Consequently, . That is, *U* is
uniformly improvable.

Thu Feb 15 14:47:09 HKT 2001