Probabilistic Effects

To define probabilistic and decision theoretic planning problems, we need to add support for probabilistic effects. The syntax for probabilistic effects is

(probabilistic p1 e1 ... pk ek)
meaning that effect ei occurs with probability pi. We require that the constraints pi ≥ 0 and $ \sum_{{i=1}}^{{k}}$pi = 1 are fulfilled: a probabilistic effect declares an exhaustive set of probability-weighted outcomes. We do, however, allow a probability-effect pair to be left out if the effect is empty. In other words,
(probabilistic p1 e1 ... pl el)
with $ \sum_{{i=1}}^{l}$pi < 1 is syntactic sugar for
(probabilistic p1 e1 ... pl el q (and))
with q = 1 −$ \sum_{{i=1}}^{l}$pi and (and) representing an empty effect (that is, no state changes). For example, the effect (probabilistic 0.9 (clogged)) means that with probability 0.9 the state variable clogged becomes true in the next state, while with probability 0.1 the state remains unchanged.

Figure 1 shows an encoding in PPDDL of the “Bomb and Toilet” example described by Kushmerick, Hanks, and Weld (1995). The requirements flag :probabilistic-effects signals that probabilistic effects are used in the domain definition. In this problem, there are two packages, one of which contains a bomb. The bomb can be defused by dunking the package containing the bomb in the toilet. There is a 0.05 probability of the toilet becoming clogged when a package is placed in it, thus rendering the goal state unreachable.

Figure 1: PPDDL encoding of “Bomb and Toilet” example.

(define (domain bomb-and-toilet)

(:requirements :conditional-effects :probabilistic-effects)
(:predicates (bomb-in-package ?pkg) (toilet-clogged)
(:action dunk-package
:parameters (?pkg)
:effect (and (when (bomb-in-package ?pkg)
(probabilistic 0.05 (toilet-clogged)))))

(define (problem bomb-and-toilet)
(:domain bomb-and-toilet)
(:requirements :negative-preconditions)
(:objects package1 package2)
(:init (probabilistic 0.5 (bomb-in-package package1)
0.5 (bomb-in-package package2)))
(:goal (and (bomb-defused) (not (toilet-clogged)))))

The problem definition in Figure 1 also shows that initial conditions in PPDDL can be probabilistic. In this particular example, we define two possible initial states with equal probability (0.5) of being the true initial state for any given execution. Table 1 lists the state variables for the “Bomb and Toilet” problem and their values in the two possible initial states. Intuitively, we can think of the initial conditions of a PPDDL planning problem as being the effects of an action forced to be scheduled right before time 0. Also, note that the goal of the problem involves negation, which is why the problem definition declares the :negative-preconditions requirements flag.

Table 1: State variables and their initial values for the “Bomb and Toilet” problem.
 Type Init 1 Init 2
bomb-in-packagepackage1 boolean true false
bomb-in-packagepackage2 boolean false true
toilet-clogged boolean false false
bomb-defused boolean false false

PPDDL allows arbitrary nesting of conditional and probabilistic effects (see example in Figure 2). This feature is in contrast to popular encodings, such as probabilistic STRIPS operators (PSOs; Kushmerik et al., 1995) and factored PSOs (Dearden & Boutilier, 1997), which do not allow conditional effects nested inside probabilistic effects. While arbitrary nesting does not add to the expressiveness of the language, it can allow for exponentially more compact representations of certain effects given the same set of state variables and actions (Rintanen, 2003). Any PPDDL action can, however, be translated into a set of PSOs with at most a polynomial increase in the size of the representation. Consequently, it follows from the results of Littman (1997) that PPDDL, after grounding (that is, full instantiation of action schemata), is representationally equivalent to dynamic Bayesian networks (Dean & Kanazawa), which is another popular representation for MDP planning problems.

Still, it is worth noting that a single PPDDL action schema can represent a large number of actions and a single predicate can represent a large number of state variables, meaning that PPDDL often can represent planning problems more succinctly than other representations. For example, the number of actions that can be represented using m objects and n action schemata with arity c is m . nc, which is not bounded by any polynomial in the size of the original representation (m + n). Grounding is by no means a prerequisite for PPDDL planning, so planners could conceivably take advantage of the more compact representation by working directly with action schemata.

Håkan L. S. Younes