next up previous
Next: Empirical Results and Discussions Up: Point-Based DP Update: The Previous: Convergence of Modified Value

Computing the Bellman Residual

In the modified value iteration algorithm, the input tex2html_wrap_inline1142 to standard DP update is always uniformly improvable. As such, its output tex2html_wrap_inline1144 dominates its input. This fact can be used to simplify the computation of the Bellman residual. As a matter of fact, the Bellman residual tex2html_wrap_inline1240 reduces tex2html_wrap_inline1698 .

To compute the latter quantity, one goes through the vectors in tex2html_wrap_inline1144 one by one. For each vector, one solves the linear program LP tex2html_wrap_inline1702 . The quantity is simply the maximum of the optimal values of the objective functions of the linear programs. Without uniformly improvability, we would have to repeat the process one more time with the roles of tex2html_wrap_inline1142 and tex2html_wrap_inline1144 exchanged.

Dr. Lian Wen Zhang
Thu Feb 15 14:47:09 HKT 2001