Next: Regression Up: Planning Graph Heuristics for Previous: Introduction

# Belief Space Planners

Our planning formulation uses regression search to find strong conformant plans, and progression search to find strong conformant and conditional plans. A strong plan guarantees that after a finite number of actions, executed from any of the possible initial states, all resulting states are goal states. Conformant plans are a special case in which the plan has no conditional branches, as in classical planning. Conditional plans are a more general case in which plans are structured as graphs because they include conditional actions (i.e., actions with both causative and observational effects). In this presentation we restrict conditional plans to DAGs, but there is no conceptual reason why they cannot be general graphs. Our plan quality metric is the maximum plan path length.

We formulate search in the space of belief states, a technique described by [6]. The planning problem is defined as the tuple $\langle D, BS_I, BS_G \rangle$, where $D$ is a domain description, $BS_I$ is the initial belief state, and $BS_G$ is the goal belief state (consisting of all states satisfying the goal). The domain $D$ is a tuple $\langle F, A \rangle$, where $F$ is a set of fluents and $A$ is a set of actions.
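As a concrete aside, the problem and domain tuples can be viewed as plain data structures. The sketch below is ours, not the paper's: field names are illustrative assumptions, and belief states are encoded as Python predicates over complete state assignments.

```python
# Minimal sketch of the planning problem tuple <D, BS_I, BS_G> and the
# domain tuple D = <F, A>. Field names are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple

State = Dict[str, bool]            # a complete interpretation over the fluents
Formula = Callable[[State], bool]  # a belief state as a predicate over states

@dataclass(frozen=True)
class Domain:
    fluents: FrozenSet[str]   # F: the set of fluents
    actions: Tuple[str, ...]  # A: the set of actions (names only, for brevity)

@dataclass(frozen=True)
class Problem:
    domain: Domain    # D
    initial: Formula  # BS_I
    goal: Formula     # BS_G

# Instantiation for the BTCS running example introduced below.
btcs = Problem(
    domain=Domain(frozenset({'arm', 'clog', 'inP1', 'inP2'}),
                  ('DunkP1', 'DunkP2', 'Flush', 'DetectMetal')),
    initial=lambda s: s['arm'] and not s['clog'] and (s['inP1'] != s['inP2']),
    goal=lambda s: not s['arm'],
)
```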

Logical Formula Representation: We make extensive use of logical formulas over $F$ to represent belief states, actions, and labels, so we first explain a few conventions. We refer to every fluent in $F$ as either a positive literal or a negative literal, either of which is denoted by $l$. When discussing the literal $l$, the opposite polarity literal is denoted $\neg l$. Thus if $l$ = at(location1), then $\neg l$ = $\neg$at(location1). We reserve the symbols $\perp$ and $\top$ to denote logical false and true, respectively. Throughout the paper we define the conjunction of an empty set as equivalent to $\top$, and the disjunction of an empty set as $\perp$.

Logical formulas are propositional sentences comprised of literals, disjunction, conjunction, and negation. We refer to the set of models of a formula $f$ as $\mathcal{M}(f)$. We consider both the disjunctive normal form (DNF) and the conjunctive normal form (CNF) of a logical formula $f$. The DNF is seen as a disjunction of "constituents", each of which is a conjunction of literals. Alternatively, the CNF is seen as a conjunction of "clauses", each of which is a disjunction of literals. We find it useful to think of DNF and CNF represented as sets: a disjunctive set of constituents or a conjunctive set of clauses. We also refer to the complete representation of a formula $f$ as a DNF where every constituent, or in this case state, is a model of $f$.
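The set view of DNF and CNF is easy to operationalize. The sketch below is ours, not the paper's; note that Python's `all([])` is `True` and `any([])` is `False`, which matches the convention above that an empty conjunction is logically true and an empty disjunction is logically false.

```python
# A sketch (ours) of the set view of DNF and CNF. A literal is a
# (fluent, polarity) pair; a state maps fluents to booleans.

def holds(literal, state):
    fluent, positive = literal
    return state[fluent] == positive

def eval_dnf(constituents, state):
    # DNF: a disjunctive set of constituents, each a conjunctive set of literals.
    return any(all(holds(l, state) for l in c) for c in constituents)

def eval_cnf(clauses, state):
    # CNF: a conjunctive set of clauses, each a disjunctive set of literals.
    return all(any(holds(l, state) for l in cl) for cl in clauses)

state = {'arm': True, 'clog': False}
print(eval_dnf([{('arm', True), ('clog', False)}], state))  # True
print(eval_dnf([], state))  # False: the empty disjunction
print(eval_cnf([], state))  # True: the empty conjunction
```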

Belief State Representation: A world state, $s$, is represented as a complete interpretation over the fluents $F$. We also refer to states as possible worlds. A belief state $BS$ is a set of states, represented symbolically as a propositional formula over $F$. A state $s$ is in the set of states represented by a belief state $BS$ if $s \in \mathcal{M}(BS)$, or equivalently $s \models BS$.

For pedagogical purposes, we use the bomb and toilet with clogging and sensing problem, BTCS, as a running example for this paper. BTCS includes two packages, one of which contains a bomb, and a toilet in which we can dunk packages to defuse potential bombs. The goal is to disarm the bomb, and the only allowable actions are dunking a package in the toilet (DunkP1, DunkP2), flushing the toilet after it becomes clogged from dunking (Flush), and using a metal detector to sense whether a package contains the bomb (DetectMetal). The fluents encoding the problem denote that the bomb is armed (arm) or not, that the bomb is in a package (inP1, inP2) or not, and that the toilet is clogged (clog) or not. We also consider a conformant variation of BTCS, called BTC, in which there is no DetectMetal action.

The BTCS initial condition, in clausal representation, is:

$BS_I = \mathit{arm} \wedge \neg \mathit{clog} \wedge (\mathit{inP1} \vee \mathit{inP2}) \wedge (\neg \mathit{inP1} \vee \neg \mathit{inP2})$,

and in constituent representation is:

$BS_I = (\mathit{arm} \wedge \neg \mathit{clog} \wedge \mathit{inP1} \wedge \neg \mathit{inP2}) \vee (\mathit{arm} \wedge \neg \mathit{clog} \wedge \neg \mathit{inP1} \wedge \mathit{inP2})$.

The goal of BTCS has the clausal and constituent representation:

$BS_G = \neg \mathit{arm}$.

However, the goal has the complete representation:

$BS_G = (\neg \mathit{arm} \wedge \neg \mathit{clog} \wedge \mathit{inP1} \wedge \neg \mathit{inP2}) \vee (\neg \mathit{arm} \wedge \mathit{clog} \wedge \mathit{inP1} \wedge \neg \mathit{inP2}) \vee (\neg \mathit{arm} \wedge \neg \mathit{clog} \wedge \neg \mathit{inP1} \wedge \mathit{inP2}) \vee (\neg \mathit{arm} \wedge \mathit{clog} \wedge \neg \mathit{inP1} \wedge \mathit{inP2}) \vee (\neg \mathit{arm} \wedge \neg \mathit{clog} \wedge \mathit{inP1} \wedge \mathit{inP2}) \vee (\neg \mathit{arm} \wedge \mathit{clog} \wedge \mathit{inP1} \wedge \mathit{inP2}) \vee (\neg \mathit{arm} \wedge \neg \mathit{clog} \wedge \neg \mathit{inP1} \wedge \neg \mathit{inP2}) \vee (\neg \mathit{arm} \wedge \mathit{clog} \wedge \neg \mathit{inP1} \wedge \neg \mathit{inP2})$.

The last four states (disjuncts) in the complete representation, in which the bomb is in both packages or in neither, are unreachable but consistent with the goal description.
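The complete representation above can be recovered mechanically by model checking every interpretation over the four fluents. The sketch below is ours, encoding belief states as Python predicates over state dictionaries.

```python
# A sketch (ours) that recovers the complete representation of a belief
# state by enumerating every interpretation over the four BTCS fluents.
from itertools import product

FLUENTS = ['arm', 'clog', 'inP1', 'inP2']

def models(formula):
    """All complete interpretations (states) satisfying the formula."""
    states = (dict(zip(FLUENTS, values))
              for values in product([True, False], repeat=len(FLUENTS)))
    return [s for s in states if formula(s)]

# BS_I: armed, unclogged, bomb in exactly one package.
bs_i = lambda s: s['arm'] and not s['clog'] and (s['inP1'] != s['inP2'])
# BS_G: simply not armed.
bs_g = lambda s: not s['arm']

print(len(models(bs_i)))  # 2: the two constituents of BS_I
print(len(models(bs_g)))  # 8: all goal states, four of which are unreachable
```

The four unreachable goal states are exactly those in which the bomb is in both packages or in neither, which no action sequence can produce from $BS_I$.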

Action Representation: We represent actions as having both causative and observational effects. Each action is described by a tuple $\langle \rho(a), \Phi(a), O(a) \rangle$, where $\rho(a)$ is an execution precondition, $\Phi(a)$ is a set of causative effects, and $O(a)$ is a set of observations. The execution precondition $\rho(a)$ is a conjunction of literals that must hold for the action to be executable. If an action is executable, we apply the set of causative effects to find successor states, and then apply the observations to partition the successor states into observational classes.

Each causative effect in $\Phi(a)$ is a conditional effect of the form $\rho^j(a) \Rightarrow \varepsilon^j(a)$, where the antecedent $\rho^j(a)$ and consequent $\varepsilon^j(a)$ are both conjunctions of literals. We handle disjunction in $\rho(a)$ or $\rho^j(a)$ by replicating the respective action or effect with different conditions, so without loss of generality we assume conjunctive preconditions. However, we cannot split disjunction in the effects: disjunction in an effect amounts to representing a set of non-deterministic outcomes. Hence we do not allow disjunction in effects, thereby restricting ourselves to deterministic effects. By convention, $\varepsilon^0(a)$ is an unconditional effect, which is equivalent to a conditional effect where $\rho^0(a) = \top$.
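Applying causative effects to a single state can be sketched as follows. This is our own illustration under an assumed representation: an effect is an (antecedent, consequent) pair of literal collections, and an unconditional effect simply has an empty antecedent.

```python
# A sketch (ours) of progressing one state through an action's conditional
# effects. A literal is a (fluent, polarity) pair.

def satisfies(state, literals):
    return all(state[f] == positive for f, positive in literals)

def apply_action(state, precondition, effects):
    if not satisfies(state, precondition):
        return None  # the action is not executable in this state
    successor = dict(state)
    for antecedent, consequent in effects:
        if satisfies(state, antecedent):  # antecedents checked in the old state
            for fluent, positive in consequent:
                successor[fluent] = positive
    return successor

# DunkP1: executable when the toilet is unclogged; it always clogs the
# toilet, and disarms the bomb if the bomb is in package 1.
dunk_p1 = ([('clog', False)],
           [([], [('clog', True)]),
            ([('inP1', True)], [('arm', False)])])

s = {'arm': True, 'clog': False, 'inP1': True, 'inP2': False}
print(apply_action(s, *dunk_p1))
# → {'arm': False, 'clog': True, 'inP1': True, 'inP2': False}
```

Note that all antecedents are evaluated against the predecessor state before any consequent is applied, so the effects fire simultaneously rather than sequentially.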

The only way to obtain observations is to execute an action with observations. Each observation formula in $O(a)$ is a possible sensor reading. For example, an action that observes the truth values of two fluents $f_1$ and $f_2$ defines $O(a) = \{f_1 \wedge f_2,\ f_1 \wedge \neg f_2,\ \neg f_1 \wedge f_2,\ \neg f_1 \wedge \neg f_2\}$. This differs slightly from the conventional description of observations in the conditional planning literature. Some works (e.g., [28]) describe an observation as a list of observable formulas, then define the possible sensor readings as all boolean combinations of those formulas. We directly define the possible sensor readings, as illustrated by our example. We note that our convention is helpful in problems where some boolean combinations of observable formulas will never be sensor readings.
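The partitioning of a belief state into observational classes can be sketched as follows (our own illustration, with sensor readings encoded as predicates over states):

```python
# A sketch (ours) of partitioning a belief state into observational classes,
# one class per possible sensor reading in O(a). States inconsistent with a
# reading are excluded from that reading's class.

def partition(belief, observations):
    return [[s for s in belief if reading(s)] for reading in observations]

# DetectMetal's sensor readings over package 1: inP1 and its negation.
o_detect_metal = [lambda s: s['inP1'], lambda s: not s['inP1']]

belief = [{'arm': True, 'clog': False, 'inP1': True, 'inP2': False},
          {'arm': True, 'clog': False, 'inP1': False, 'inP2': True}]
classes = partition(belief, o_detect_metal)
print([len(c) for c in classes])  # [1, 1]: the observation splits the belief state
```

Each non-empty class becomes a branch of the conditional plan, which is how DetectMetal lets a plan dunk only the package that actually contains the bomb.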

The causative and sensory actions for the example BTCS problem are:

DunkP1: $\langle \neg \mathit{clog},\ \{\mathit{clog},\ \mathit{inP1} \Rightarrow \neg \mathit{arm}\},\ \{\} \rangle$,

DunkP2: $\langle \neg \mathit{clog},\ \{\mathit{clog},\ \mathit{inP2} \Rightarrow \neg \mathit{arm}\},\ \{\} \rangle$,

Flush: $\langle \top,\ \{\neg \mathit{clog}\},\ \{\} \rangle$, and

DetectMetal: $\langle \top,\ \{\},\ \{\mathit{inP1},\ \neg \mathit{inP1}\} \rangle$.
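To tie the pieces together, the following sketch (ours, under an assumed state representation) progresses the initial belief state through the action sequence DunkP1, Flush, DunkP2, a strong conformant plan for the BTC variant: every resulting state satisfies the goal.

```python
# A sketch (ours) of belief-state progression over a conformant plan.
# A literal is a (fluent, polarity) pair; an action is a (precondition,
# effects) pair, each effect an (antecedent, consequent) pair.

def satisfies(state, literals):
    return all(state[f] == positive for f, positive in literals)

def progress(belief, action):
    precondition, effects = action
    successors = []
    for s in belief:
        assert satisfies(s, precondition), "a strong plan must stay executable"
        s2 = dict(s)
        for antecedent, consequent in effects:
            if satisfies(s, antecedent):
                for fluent, positive in consequent:
                    s2[fluent] = positive
        successors.append(s2)
    return successors

dunk_p1 = ([('clog', False)],
           [([], [('clog', True)]), ([('inP1', True)], [('arm', False)])])
dunk_p2 = ([('clog', False)],
           [([], [('clog', True)]), ([('inP2', True)], [('arm', False)])])
flush = ([], [([], [('clog', False)])])

belief = [{'arm': True, 'clog': False, 'inP1': True, 'inP2': False},
          {'arm': True, 'clog': False, 'inP1': False, 'inP2': True}]
for action in (dunk_p1, flush, dunk_p2):
    belief = progress(belief, action)
print(all(not s['arm'] for s in belief))  # True: every reachable state is a goal state
```

The plan is longer than necessary when sensing is allowed: in BTCS, branching on DetectMetal's reading lets each branch dunk only one package.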


2006-05-26