In the modified value iteration algorithm, the input to standard DP update is always uniformly improvable. As such, its output dominates its input. This fact can be used to simplify the computation of the Bellman residual. As a matter of fact, the Bellman residual reduces .
To compute the latter quantity, one goes through the vectors in one by one. For each vector, one solves the linear program LP . The quantity is simply the maximum of the optimal values of the objective functions of the linear programs. Without uniformly improvability, we would have to repeat the process one more time with the roles of and exchanged.