Our planning formulation uses regression search to find strong conformant plans and progression search to find strong conformant and conditional plans. A strong plan guarantees that after a finite number of actions executed from any of the many possible initial states, all resulting states are goal states. Conformant plans are a special case where the plan has no conditional plan branches, as in classical planning. Conditional plans are a more general case where plans are structured as a graph because they include conditional actions (i.e. the actions have causative and observational effects). In this presentation, we restrict conditional plans to DAGs, but there is no conceptual reason why they cannot be general graphs. Our plan quality metric is the maximum plan path length.

We formulate search in the space of belief states, a technique described by [6]. The planning problem is defined as the tuple , where is a domain description, is the initial belief state, and is the goal belief state (consisting of all states satisfying the goal). The domain is a tuple , where is a set of fluents and is a set of actions.

**Logical Formula Representation:** We make extensive use of
logical formulas over
to represent belief states, actions, and
labels, so we first explain a few conventions. We refer to
every fluent in
as either a positive literal or a negative
literal, either of which is denoted by
. When discussing the
literal
, the opposite polarity literal is denoted
. Thus
if
=
at(location1), then
= at(location1). We
reserve the symbols
and
to denote logical false and
true, respectively. Throughout the paper we define the conjunction
of an empty set equivalent to
, and the disjunction of an
empty set as
.

Logical formulas are propositional sentences comprised of literals,
disjunction, conjunction, and negation. We refer to the set of
models of a formula
as
. We consider the
disjunctive normal form of a logical formula
,
,
and the conjunctive normal form of
,
. The DNF is seen
as a disjunction of ``constituents''
each of which is a
conjunction of literals. Alternatively the CNF is seen as a
conjunction of ``clauses''
each of which is a disjunction of
literals.^{1} We find it useful to think of DNF and CNF represented as
sets - a disjunctive set of constituents or a conjunctive set of
clauses. We also refer to the complete representation
of a
formula
as a DNF where every constituent - or in this case
state
- is a model of
.

**Belief State Representation:** A world state,
, is represented
as a complete interpretation over fluents. We also refer to states
as possible worlds. A belief state
is a set of states and is
symbolically represented as a propositional formula over
. A
state
is in the set of states represented by a belief state
if
, or equivalently
.

For pedagogical purposes, we use the bomb and toilet with clogging
and sensing problem, BTCS, as a running example for this paper.^{2} BTCS is a
problem that includes two packages, one of which contains a bomb,
and there is also a toilet in which we can dunk packages to defuse
potential bombs. The goal is to disarm the bomb and the only
allowable actions are dunking a package in the toilet (DunkP1,
DunkP2), flushing the toilet after it becomes clogged from dunking
(Flush), and using a metal-detector to sense if a package contains
the bomb (DetectMetal). The fluents encoding the problem denote that
the bomb is armed (arm) or not, the bomb is in a package (inP1,
inP2) or not, and that the toilet is clogged (clog) or not. We also
consider a conformant variation on BTCS, called BTC, where there is
no DetectMetal action.

The belief state representation of the BTCS initial condition, in clausal representation is:

= arm clog (inP1 inP2) ( inP1 inP2),

and in constituent representation is:

= (arm clog inP1 inP2) (arm clog inP1 inP2).

The goal of BTCS has the clausal and constituent representation:

= = arm.

However, the goal has the complete representation:

= ( arm clog inP1 inP2) ( arm clog inP1 inP2)

( arm clog inP1 inP2) ( arm clog inP1 inP2)

( arm clog inP1 inP2) ( arm clog inP1 inP2)

( arm clog inP1 inP2) ( arm clog inP1 inP2).

The last four states (disjuncts) in the complete representation are unreachable, but consistent with the goal description.

**Action Representation:** We represent actions as having both
causative and observational effects. All actions
are described
by a tuple
where
is an execution precondition,
is a set of
causative effects, and
is a set of observations. The
execution precondition,
, is a conjunction of literals
that must hold for the action to be executable. If an action is
executable, we apply the set of causative effects to find successor
states and then apply the observations to partition the successor
states into observational classes.

Each causative effect is a conditional effect of the form , where the antecedent and consequent are both a conjunction of literals. We handle disjunction in or a by replicating the respective action or effect with different conditions, so with out loss of generality we assume conjunctive preconditions. However, we cannot split disjunction in the effects. Disjunction in an effect amounts to representing a set of non-deterministic outcomes. Hence we do not allow disjunction in effects thereby restricting to deterministic effects. By convention is an unconditional effect, which is equivalent to a conditional effect where .

The only way to obtain observations is to execute an action with observations. Each observation formula is a possible sensor reading. For example, an action that observes the truth values of two fluents and defines . This differs slightly from the conventional description of observations in the conditional planning literature. Some works (e.g., [28]) describe an observation as a list of observable formulas, then define possible sensor readings as all boolean combinations of the formulas. We directly define the possible sensor readings, as illustrated by our example. We note that our convention is helpful in problems where some boolean combinations of observable formulas will never be sensor readings.

The causative and sensory actions for the example BTCS problem are:

DunkP1: = clog, clog, inP1 arm ,

DunkP2: clog, clog, inP2 arm ,

Flush: , clog , and

DetectMetal: inP1, inP1 .