As mentioned in Section 3.1, point-based update is closely related to several exact algorithms for standard DP update, namely one-pass (Sondik 1971), linear-support (Cheng 1988), and relaxed-region (Cheng 1988). They all back up on a finite set of belief points. The difference is that these exact algorithms generate the points systematically, which is expensive, whereas point-based update generates them heuristically.

There are several other exact algorithms for
standard
DP update. The enumeration/reduction
algorithms (Monahan 1982, Eagle 1984)
and incremental pruning
(Zhang and Liu 1997, Cassandra *et al.* 1997)
first generate a set of vectors that is not parsimonious and
then prune the extraneous vectors by solving linear programs.
Point-based DP update never generates extraneous vectors.
It might generate duplicate vectors. However, duplicates
are pruned without solving linear programs.
The witness algorithm (Kaelbling *et al.* 1998) has two stages.
In the first stage, it considers actions one by one. For
each action, it
constructs a set of vectors based on a finite number of
systematically generated belief points using an operator similar
to the backup operator.
In the second stage, vectors for different actions are pooled
together and extraneous vectors are pruned.
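To make the distinction concrete, here is a minimal Python sketch (all names are ours, not from the paper): duplicate alpha-vectors can be removed by direct comparison, which is all point-based DP update requires, whereas fully pruning extraneous non-duplicate vectors in general needs one linear program per vector. A pointwise-dominance check is shown as a cheap partial stand-in for that LP test.

```python
# Sketch (hypothetical helper names): two pruning steps for sets of
# alpha-vectors, each represented as a list of per-state values.

def prune_duplicates(vectors, ndigits=10):
    """Remove exact duplicates by hashing rounded coordinates.
    This is all point-based DP update needs -- no linear programs."""
    seen, kept = set(), []
    for v in vectors:
        key = tuple(round(x, ndigits) for x in v)
        if key not in seen:
            seen.add(key)
            kept.append(list(v))
    return kept

def prune_pointwise_dominated(vectors):
    """Drop vectors weakly dominated at every state by a kept vector.
    A cheap partial substitute for the full LP-based pruning used by
    enumeration/reduction and incremental pruning: a vector can be
    extraneous without being pointwise dominated, so some extraneous
    vectors survive this test."""
    kept = []
    for v in vectors:
        if any(all(w[i] >= v[i] for i in range(len(v))) for w in kept):
            continue  # v adds nothing over an already-kept vector
        # v may in turn dominate previously kept vectors
        kept = [w for w in kept
                if not all(v[i] >= w[i] for i in range(len(v)))]
        kept.append(v)
    return kept
```

For example, `prune_pointwise_dominated([[1.0, 1.0], [0.0, 0.0], [2.0, 0.0]])` discards only `[0.0, 0.0]`: the other two vectors are each best at some belief.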

There are proposals to carry out standard update approximately
by dropping vectors that are marginally useful (e.g., Kaelbling *et al.*
1998, Hansen 1998).
Here is one idea along this line that we have empirically
evaluated. Recall that to achieve ε-optimality,
the stopping threshold for the Bellman residual should be
ε(1-λ)/2λ, where λ is the discount factor.
Our idea is to drop marginally useful vectors at various stages of
standard update while keeping the overall error of each update under
some tolerance δ, and to stop when the Bellman residual falls below
ε(1-λ)/2λ - δ.
It is easy to see that ε-optimality is still guaranteed this way.
We have also tried starting with a large error tolerance, in the hope
of pruning more vectors, and gradually decreasing the tolerance
to δ. Reasonable improvements have been observed, especially when
a high-quality policy is not required.
However, such approximate updates are much more expensive
than point-based updates. In the context of the modified
value iteration algorithm, they are more suitable as alternatives to
standard updates than to point-based updates.
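For completeness, the stopping threshold mentioned above follows from a standard bound for discounted value iteration (λ denotes the discount factor; the sketch below is ours, not taken from the paper):

```latex
% If the Bellman residual is small, the greedy policy \pi_n
% with respect to V_n satisfies the standard bound
\[
\|V^{\pi_n} - V^*\|_\infty
  \;\le\; \frac{2\lambda}{1-\lambda}\,\|V_n - V_{n-1}\|_\infty ,
\]
% so stopping once
\[
\|V_n - V_{n-1}\|_\infty \;\le\; \frac{\epsilon(1-\lambda)}{2\lambda}
\qquad\Longrightarrow\qquad
\|V^{\pi_n} - V^*\|_\infty \;\le\; \epsilon ,
\]
% i.e., the greedy policy is \epsilon-optimal.
```

Dropping marginally useful vectors perturbs each update, which is why a margin for the accumulated per-update error must be subtracted from this threshold.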

Thu Feb 15 14:47:09 HKT 2001