Let be a set of vectors and *b* be a belief state.
The *backup operator* constructs a new vector in three steps:

- For each action
*a*and each observation*z*, find the vector in that has maximum inner product with -- the belief state for the case when*z*is observed after executing action*a*in belief state*b*. If there are more than one such vector, break ties lexicographically (Littman 1996). Denote the vector found by . - For each action
*a*, construct a vector by: - Find the vector, among the 's, that has
maximum inner product with
*b*. If there are more than one such vector, break ties lexicographically. Denote the vector found by .

It has been shown
(Smallwood and Sondik 1973, Littman 1996)
that is
a member of -- the set of vectors obtained by
performing DP update on . Moreover, *b*
is a witness point for .

The above fact is the corner stone of several DP update algorithms.
The one-pass algorithm (Sondik 1971),
the linear-support algorithm (Cheng 1988),
and the relaxed-region
algorithm (Cheng 1988) operate in the following
way: They first systematically search for witness points
of vectors in and then obtain the vectors using the
backup operator. The witness algorithm (Kaelbling
*et al.* 1998) employs a similar idea.

Thu Feb 15 14:47:09 HKT 2001