Robotics Institute, School of Computer Science, Carnegie Mellon University
This work analyzes the relationship between flexible computation and robustness analysis. We start from the assumption that computational effort must be spent only when it can realistically affect decisions. We use robustness analysis to detect when decisions are (or are not) affected by changes in the agent state. As computational effort is aimed at changing the agent state, robustness analysis is the right tool for evaluating the impact of computational effort. We present models and methods that lead to efficient robustness analysis, particularly in the domain of maximum a posteriori estimation. We demonstrate this approach in an image processing system that estimates position from outdoor images.
These pages contain the paper presented at the AAAI Fall Symposium, Flexible Computation in Intelligent Systems, November 1996.
This work analyzes the relationship between flexible computation and robustness analysis. Decisions are robust when they are acceptable across variations in the decision maker state. The idea pursued here is that effort must halt when decisions cannot be reversed or defeated by changes in the agent state: computational effort is spent only when it can realistically affect decisions. As robustness analysis is concerned with the impact of changes in the agent state, it is the right tool for evaluating computational effort.
The key observation is that we can conduct robustness analysis efficiently. We benefit from recent breakthroughs in the theory of robust Bayesian Statistics [Berger1985], which demonstrate the great convenience of the bounded ratio family for robustness analysis [DeRobertis & Hartigan1981, Wasserman1992].
Our approach grew out of experience with an image processing system that estimates position from outdoor images; we use this system as a case study. We model measurements through bounded ratio families. Position estimation begins with simplified measurements, which are used to narrow the search to a small number of possible estimates. Robustness of the estimates is then verified; if robustness fails, a refined model is used in a new estimation round. The estimation procedure terminates when a single robust estimate emerges, or no more refinement is possible for the image measurements. This procedure uses robustness analysis to suggest the most efficient use of computation.
This work proposes an approach for flexible computation which is based on robustness as a measure of value of computation. A classic approach for evaluation of computational value is to obtain the expected value of computation and proceed if expected gains are larger than expected costs [Heckerman & Jimison1989, Horvitz1989, Matheson1968, Russell & Wefald1991]. In the robustness approach, the gains from computation are judged with respect to computations already performed, not expected results.
A simplified example will clarify the robustness approach. Consider a decision problem with two possible actions, a1 and a2. Suppose the utility of each action depends on a state theta. Utilities are defined through functions u(a1, theta) for a1 and u(a2, theta) for a2. Beliefs about the state theta are translated to a distribution p(theta). Judgements about actions are made with respect to expected utility, i.e., the values
E[ai] = Integral[u(ai,theta) p(theta) dtheta].The best action is the action with largest expected utility.
For this example, suppose the distribution p(theta) is coded into a complex Bayesian network. There are methods that allow progressive construction of bounds for p(theta) [Draper & Hanks1994, Jaakkola & Jordan1996]. For a particular set of bounds, consider a credal set defined by all consistent distributions. Suppose this credal set can be indexed by alpha in [0,1], and call a member of the credal set by palpha(theta). For each palpha(theta) we can obtain Ealpha[ai], the expected value of action ai with distribution palpha(theta).
Consider a graph of alpha vs. Ealpha[ai]. Each interval of alpha values is a credal set. One such graph is depicted in Figure 1, for which we analyze three possible credal sets.
Figure: Expected utility Ealpha[ai] for actions a1 and a2
Robustness can be either an additional figure of merit for assessments of value of computation or a direct indicator of value of computation. The point is that robustness is an important measure for assessment of value of computation. The particular use of robustness analysis is problem dependent.
In general, we can think of a base level decision maker, who tries to make decisions with coarse models. Sometimes these coarse, abstracted models are sharp enough for a robust decision, sometimes they are not. A meta level decision maker looks at the robustness of the base level decisions and suggests possible refinements that require more computation. The important question for the meta level system is to detect when the abstracted details can be useful [Chrisman1992], exactly the question that robustness analysis is suited to answer.
Robust Bayesian tools form the basis of our analysis [Berger1985]. Such tools have been formalized and axiomatized; the name Quasi-Bayesian [Giron & Rios1980, Walley1991] refers to an axiomatic theory of decision that uses convex sets of distributions for reasoning about uncertainty. The set of distributions is called credal set [Levi1980] and represents the various distributions the agent deems possible at a given point. To study robustness of a decision, we analyze how much a decision is affected by variations within the credal set. Notice that, compared to the Bayesian framework for expected value of computations [Horvitz1989], the prescription above does not require ``meta-probabilities'' over expected probabilities. ``Meta-probabilities'' may be be hard to specify and elicit in a real problem.
The main question is: how can we quickly conduct robustness analysis? We present here a model that fulfills this requirement.
Recent results in the theory of robust Statistics allow us to perform robustness analysis with efficient models. In particular, it has been proven that the bounded ratio family of distributions can be manipulated quickly and still produce all necessary information for robustness analysis [DeRobertis & Hartigan1981]. A bounded ratio family is defined by two measures, l(A) and u(A), where A is an event. The family consists of all probability densities p(A) so that
l(A) <= alpha p(A) <= u(A) for some alpha.In this family of distributions, we can obtain posterior and marginal densities by using summations and Bayes rule on the measures l(A) and u(A) [Wasserman1992]. For any event A the minimum and maximum values of the probability of A are:
p(A) = l(A)/(l(A) + u(Ac))
p(A) = u(A)/(u(A) + l(Ac)) .
In this paper we concentrate on maximum a posteriori estimation. Estimation is viewed as decision making, where the actions are the estimates. Our purpose is to estimate variables by the maximum value of their posterior distribution. In this setting, robustness is attained when posterior probability ranges for different estimates do not overlap. For example, consider two estimates A and B. We have a robust situation when the posterior probability of A is inside [0.1,0.2] and the posterior probability of B is inside [0.7,0.8]. We have a situation that deserves further computational effort when the posterior probability intervals overlap.
Suppose the agent is trying to decide between two estimates, A and B. Applying Bayes rule and marginalization on l() and u(), the agent obtains l(A), u(A), l(B) and u(B). Four quantities can be computed:
u(A)/(u(A) + l(Ac))
l(A)/(l(A) + u(Ac)),
u(B)/(u(B) + l(Bc)),
l(B)/l(B) + u(Bc).
p(A) = l(A)/(l(A) + u(Ac)),
p(B) = u(B)/(u(B) + l(Bc)),
p(B) = l(B)/l(B) + u(Bc).
If p(B) <= p(A) or p(A) <= p(B), A or B are respectively robust estimates. If the ranges of posterior distributions overlap, further computation is necessary.
We apply the theory above to an estimation activity that arises in planetary robotic exploration. Figure 2 illustrates the idea we pursue. A rover in a planetary mission sends a stream of images to mission control. An interface receives the images and produces estimates for the position of the rover. A large map of the rover's environment is available; the operator can relate the estimates produced by the system with information in the map. The system automatically segments the images, extracts the horizon and searches for mountain structures in the horizon. These steps require a significant amount of computation, taking 5 to 6 seconds per image in a Silicon Graphics Impact workstation. Figure 3 shows the output of the image processing system for a sequence of images from Pittsburgh; the boxes indicate the structures detected by the image processing system.
Figure: An Interface for Remote Teleoperation of Mobile Robots
Figure: Result of Image Processing System
The key aspect we focus on is the search for a maximum a posteriori position estimate given the results of the image processing system. We have time constraints for this task as the operators must interact smoothly with the robot. After image processing, the system has to search for the best matches between image mountains and structures in the map. The search space is equal to the number of possible positions on the map. Maps have approximately 36km2, quantized at 30 meter intervals. This gives us more than 30,000 positions to be checked and evaluated. A position (x,y) is evaluated by considering differences between image structures and map structures that are visible from the point (x,y).
The search is much simplified if we look at a reduced description of the ``prominent'' features in the image (larger mountains and peaks). Here is the trade-off: we have to decrease the number of image features for efficiency purposes, but increase the number of image features to improve the reliability of the estimates. We must not consider all possible image features at once. Instead, we must use a small number of features and analyze the robustness of the resulting estimates. If we have a single robust estimate, we stop. If not, we must continue looking at more features, spending computation only on the estimates that show the potential to be the best. We now outline this robustness based approach.
To obtain a posterior distribution over positions (x,y), we need a prior distribution and likelihoods for the visual measurements. As we have no a priori information about position, we define a single uniform distribution over positions as our prior.
The formulation of a complete likelihood function relating the position and images (measurements) is extremely costly because it involves comparing the images from the robot to images obtained from the map [Thompson et al. 1990]. We have to simplify the description of image features. So, instead of specifying the complete likelihood to begin with, the estimation procedure uses an abstracted description of the images. Each image is converted to a vector of point features containing information about location of ``prominent'' image structures. To indicate the abstracted nature of this model, we represent the likelihood function by a bounded ratio family.
For a single mountain detected in the image, with image position b and height h, we have the lower and upper likelihood measures:
l(b, h| x, y) ~ alpha(h-h(x,y))
exp ( - ((b-b(x,y))2)/(sigmab2) )
u(b, h| x, y) ~ alpha(h-h(x,y))
exp ( - ((b-b(x,y))2)/(sigmab2) ) + k
u(b, h| x, y) ~ alpha(h-h(x,y)) exp ( - ((b-b(x,y))2)/(sigmab2) ) + k
The functions alpha() <= alpha() are defined by tables. In our implementation, the gap between these two functions increases as the height of an image feature increases. This gap represents the fact that larger image structures have potentially more information hidden in the abstracted model than small structures do. Tuning these various parameters is similar to assessing a Bayesian model, but here we can codify computational aspects that are absent in the single probability approach.
We start with a selection of a few image features, described as point features, and obtain ranges of posterior distribution from both the prior and likelihood measures described in the last section. We then check whether the ranges overlap for different estimates. If there are many overlapping candidate positions, additional image features and refined descriptions (such as roundness and spatial extent) are used to pin down the best estimate. Finally, the terrain is rendered as it would be seen from the candidate estimates, and the image from the rover is directly compared with the rendered images. The best candidate is the one that produces the best match between rover and rendered images. The procedure stops when a best estimate emerges or no image feature is available. Because we look only at complex features for a few overlapping estimates at any given step of the estimation procedure, the cost of the overall procedure is much smaller than a full Bayesian analysis.
This paper emphasized robustness as an effective measure for the value of computation. Direct robustness measures can either be incorporated into the usual expected value of computation metrics, or replace such metrics. The key challenge is to present techniques that allow a decision maker to quickly evaluate the robustness of a decision, which we met by proposing the bounded ratio model.
We focused on maximum a posteriori estimation. The robustness measure we used is the simplest: whether or not the posterior probabilities for two estimates overlap. As a case study, we described how we defined bounded ratio families for an image processing application, and how these models are used for flexible control of computation.
The approach here presented can be useful in a variety of circumstances. First, it may be easier to assess robustness than to elicit all quantities involved in the expected value of computation. Second, it may be natural to assess robustness when approximation algorithms provide probability bounds. Third, models like the bounded ratio family lead to very efficient methods for control of computational effort.
Fri Oct 25 15:54:50 EDT 1996