 
 
 
 
 
   
 )
)
The simplest approach for using planning graphs for belief space
planning heuristics is to use a ``classical'' planning graph. To
form the initial literal layer from the projected belief state, we
could either sample a single state (denoted  ) or use an
aggregate state (denoted
) or use an
aggregate state (denoted  ). For example, in CBTC (see Figure
5) assuming regression search with
). For example, in CBTC (see Figure
5) assuming regression search with 
 , the
initial level
, the
initial level 
 of the planning graph for
 of the planning graph for  might
be:
 might
be:
 = {arm, clog, inP1,
 = {arm, clog, inP1,  inP2}
inP2}
and for  it is defined by the aggregate state
 it is defined by the aggregate state
 :
:
 = {arm, clog, inP1, inP2,
 = {arm, clog, inP1, inP2,  inP1,
inP1,
 inP2}.
inP2}.
Since these two versions of the single planning graph have
identical semantics, aside from the initial literal layer, we
proceed by describing the  graph and point out differences
with
 graph and point out differences
with  where they arise.
 where they arise.
Graph construction is identical to classical planning graphs
(including mutex propagation) and stops when two subsequent literal
layers are identical (level off).  We use the planning graph
formalism used in IPP [20] to allow for explicit
representation of conditional effects, meaning there is a literal
layer 
 , an action layer
, an action layer 
 , and an effect
layer
, and an effect
layer 
 in each level
 in each level  .  Persistence for a literal
.  Persistence for a literal
 , denoted by
, denoted by  , is represented as an action where
, is represented as an action where
 .  A literal is in
.  A literal is in 
 if an effect from the previous effect layer
 if an effect from the previous effect layer 
 contains the literal in its consequent.  An action is in the action
layer
contains the literal in its consequent.  An action is in the action
layer 
 if every one of its execution precondition
literals is in
 if every one of its execution precondition
literals is in 
 .  An effect is in the effect layer
.  An effect is in the effect layer
 if its associated action is in the action layer
 if its associated action is in the action layer 
 and every one of its antecedent literals is in
 and every one of its antecedent literals is in 
 .
Using conditional effects in the planning graph avoids factoring an
action with conditional effects into a possibly exponential number
of non-conditional actions, but adds an extra planning graph layer
per level.  Once our graph is built, we can extract heuristics.
.
Using conditional effects in the planning graph avoids factoring an
action with conditional effects into a possibly exponential number
of non-conditional actions, but adds an extra planning graph layer
per level.  Once our graph is built, we can extract heuristics.
No Aggregation:  Relaxed plans within a single planning graph
are able to measure, under the most optimistic assumptions, the
distance between two belief states.  The relaxed plan represents a
distance between a subset of the initial layer literals and the
literals in a constituent of our belief state.  In the  , the
literals from the initial layer that are used for support may not
hold in a single state of the projected belief state, unlike the
, the
literals from the initial layer that are used for support may not
hold in a single state of the projected belief state, unlike the
 . The classical relaxed plan heuristic
. The classical relaxed plan heuristic 
 finds a
set of (possibly interfering) actions to support the goal
constituent. The relaxed plan
 finds a
set of (possibly interfering) actions to support the goal
constituent. The relaxed plan  is a subgraph of the planning
graph, of the form {
 is a subgraph of the planning
graph, of the form {
 ,
, 
 ,
,
 , ...,
, ..., 
 ,
, 
 ,
, 
 }. Each of the layers contains a
subset of the vertices in the corresponding layer of the planning
graph.
}. Each of the layers contains a
subset of the vertices in the corresponding layer of the planning
graph.
More formally, we find the relaxed plan to support the constituent
 that is reached earliest in the graph
(as found by the
 that is reached earliest in the graph
(as found by the 
 heuristic in Appendix A).
Briefly,
 heuristic in Appendix A).
Briefly, 
 returns the first level
 returns the first level  where a
constituent of
 where a
constituent of  has all its literals in
 has all its literals in 
 and none
are marked pair-wise mutex.  Notice that this is how we incorporate
negative interactions into our heuristics.  We start extraction at
the level
 and none
are marked pair-wise mutex.  Notice that this is how we incorporate
negative interactions into our heuristics.  We start extraction at
the level  , by defining
, by defining 
 as the literals in
the constituent used in the level heuristic. For each literal
 as the literals in
the constituent used in the level heuristic. For each literal 
 , we select a supporting effect (ignoring mutexes)
from
, we select a supporting effect (ignoring mutexes)
from 
 to form the subset
 to form the subset 
 . We
prefer persistence of literals to effects in supporting literals.
Once a supporting set of effects is found, we create
. We
prefer persistence of literals to effects in supporting literals.
Once a supporting set of effects is found, we create 
 as all actions with an effect in
 as all actions with an effect in 
 . Then the needed preconditions for the actions and
antecedents for chosen effects in
. Then the needed preconditions for the actions and
antecedents for chosen effects in 
 and
 and 
 are added to the list of literals to support from
 are added to the list of literals to support from
 .  The algorithm repeats until we find the
needed actions from
.  The algorithm repeats until we find the
needed actions from 
 . A relaxed plan's value is the
summation of the number of actions in each action layer. A literal
persistence, denoted by a subscript ``p'', is treated as an action
in the planning graph, but in a relaxed plan we do not include it in
the final computation of
. A relaxed plan's value is the
summation of the number of actions in each action layer. A literal
persistence, denoted by a subscript ``p'', is treated as an action
in the planning graph, but in a relaxed plan we do not include it in
the final computation of 
 . The single
graph relaxed plan heuristic is computed as
. The single
graph relaxed plan heuristic is computed as
For the CBTC problem we find a relaxed plan from the  , as
shown in Figure 5 as the bold edges and nodes.  Since
, as
shown in Figure 5 as the bold edges and nodes.  Since
 arm and
arm and  clog are non mutex at level two, we can use
persistence to support
clog are non mutex at level two, we can use
persistence to support  clog and DunkP1 to support
clog and DunkP1 to support  arm in
arm in
 . In
. In 
 we can use persistence
for inP1, and Flush for
 we can use persistence
for inP1, and Flush for  clog. Thus,
clog. Thus, 
 = 2
because the relaxed plan is:
 = 2
because the relaxed plan is:
 inP1
inP1 , Flush
, Flush ,
,
 inP1
inP1 ,
, 
 Flush
Flush },
},
 inP1
inP1 clog
clog ,
,
 clog
clog , DunkP1
, DunkP1 ,
,
 clog
clog ,
,
 DunkP1
DunkP1 },
},
 arm
arm clog
clog .
.
The relaxed plan does not use both DunkP2 and DunkP1 to support
 arm.  As a result
arm.  As a result  arm is not supported in all worlds
(i.e. it is not supported when the state where inP2 holds is our
initial state). Our initial literal layer threw away knowledge of
inP1 and inP2 holding in different worlds, and the relaxed plan
extraction ignored the fact that
arm is not supported in all worlds
(i.e. it is not supported when the state where inP2 holds is our
initial state). Our initial literal layer threw away knowledge of
inP1 and inP2 holding in different worlds, and the relaxed plan
extraction ignored the fact that  arm needs to be supported in
all worlds. Even with an
arm needs to be supported in
all worlds. Even with an  graph, we see similar behavior
because we are reasoning with only a single world.  A single,
unmodified classical planning graph cannot capture support from all
possible worlds - hence there is no explicit aggregation over
distance measures for states.  As a result, we do not mention
aggregating states to measure positive interaction, independence, or
overlap.
 graph, we see similar behavior
because we are reasoning with only a single world.  A single,
unmodified classical planning graph cannot capture support from all
possible worlds - hence there is no explicit aggregation over
distance measures for states.  As a result, we do not mention
aggregating states to measure positive interaction, independence, or
overlap.
 
 
 
 
