Meta-Level Control for Decision-Theoretic Planners

Richard Goodwin

School of Computer Science, Carnegie Mellon University

Agents plan in order to improve their performance, but planning takes time and other resources that can degrade performance. To plan effectively, an agent needs to be able to create high quality plans efficiently. Artificial intelligence planning techniques provide methods for generating plans, whereas decision theory offers expected utility as a measure for assessing plan quality, taking the value of each outcome and its likelihood into account. The benefits of combining artificial intelligence planning techniques and decision theory have long been recognized. However, these benefits will remain unrealized if the resulting decision-theoretic planners cannot generate plans with high expected utility in a timely fashion. In this dissertation, we address the meta-level control problem of allocating computation to make decision-theoretic planning efficient and effective.

For efficiency, decision-theoretic planners iteratively approximate the complete solution to a decision problem: planners generate partially elaborated, abstract plans; only promising plans are further refined, and execution may begin before a plan with the highest expected utility is found. Our work addresses three key meta-level control questions related to the planning process: whether to generate more plans or refine an existing partial plan, which part of a partial plan to refine, and when to commence execution. We show that an optimistic strategy that refines the plan with the highest bounds on expected utility first uses minimal computation when looking for a plan with the highest expected utility. When looking for a satisficing solution, we weigh the opportunity cost of forgoing more planning against the computational cost to decide whether to generating more plans. When selecting which part of a plan to refine, we use sensitivity analysis to identify refinements that can quickly distinguish plans with high expected utility. For deciding when to begin execution, previous methods have ignored the possibility of overlapping planning and execution. By taking this possibility into account, our method can improve performance by accomplishing a task more quickly. To validate our theoretical results, our methods have been applied to four decision-theoretic planners used in domains such as mobile robot route planning and medical treatment planning. Empirical tests against competing meta-level control methods show the effectiveness of our approach.

Postscript