This section is greatly summarized! For the whole story, consult the compressed postscript version.
We wish to minimize the Bayes risk for a Gaussian prior with mean , variance by using a plan . Note that the number of observations is also a random variable to be averaged in the expectation.
Dynamic programming applied to this minimization problem leads to a value iteration algorithm [DeGroot1970]. Very briefly, the algorithm assumes that two initial guesses of the best risk are given, one smaller and one larger than the correct best risk.
Two iterations compose the algorithm, one for the "small" risk, another for the "larger" risk.
The algorithm is guaranteed to ``sandwich'' the best risk; i.e., the smaller and larger risk guesses converge monotonically to the best risk.
This produces the decision regions, with the Continue region between the and regions. Intuitively, the algorithm ``sandwiches'' the Indeterminate region.
Sun Jul 14 18:32:36 EDT 1996