The Fourth International Planning Competition (IPC-4) was held as part of the International Conference on Planning and Scheduling (ICAPS'04) in Vancouver, British Columbia in June 2004. By request of the ICAPS'04 organizers, Sven Koenig and Shlomo Zilberstein, we were asked to create the first probabilistic planning track as part of IPC-4.
The overriding goal of the first probabilistic planning track was to bring together two communities converging on a similar set of research issues and aid them in creating comparable tools and evaluation metrics. One community consists of Markov decision process (MDP) researchers interested in developing algorithms that apply to powerfully expressive representations of environments. The other consists of planning researchers incorporating probabilistic and decision theoretic concepts into their planning algorithms. Cross fertilization has begun, but the intent of the probabilistic planning track was to create a set of shared benchmarks and metrics that could crystallize efforts in this area of study.
We created a new domain description language called PPDDL1.0, described in Section 2. PPDDL stands for “Probabilistic Planning Domain Definition Language”, in analogy to PDDL (McDermott, 2000), which was introduced in IPC-1. PPDDL is modeled on PDDL2.1 (Fox & Long, 2003), the domain-description language for deterministic domains used in IPC-3. Syntactically, this language has a STRIPS/ADL-like flavor, but includes probabilistic constructs. To focus the energy of participants on issues of dealing with uncertainty, we chose not to include constructs for durative actions in PPDDL1.0.
By basing the domain-description language on PDDL, we sought to remain in the spirit of the existing planning competition, which we hope will further bring the communities together. The PPDDL representation itself is relational. Although representations with explicit objects are not a traditional feature of MDP-based domain-description languages, algorithms that exploit these features have begun to appear. We expected participants to propositionalize the domains before running their planning algorithms and, for the most part, they did so.
A fully functional parser for PPDDL was provided to participants in C++ in the form of a plan validator and very simple planner. Some basic tools to convert PPDDL to a decision-diagram representation were also provided. In many ways, handling the rich constructs of PPDDL was the main hurdle for many participants and we tried to provide as much assistance as we could on this dimension.
Although PPDDL supports numerical fluents, this feature was not used to its fullest extent in the competition. Numerical quantities were used only for representing reward values, and reward effects were required to be additive.
Since the classical track is well established at this point, it is helpful to contrast how the probabilistic track differs. The defining difference, of course, is that actions can have uncertain effects. That is, a “pickup” action in a Blocksworld might behave differently on different occasions, even from the same state. This single difference has a number of significant consequences. First, the optimal action choices for reaching a goal may be a function of the probabilistic outcomes along the way—a single sequence of actions may not be sufficient. As a result, it can be difficult to output a “plan”. For this reason, we decided not to separate plan synthesis and execution into two phases, but instead evaluated planners online. Second, because of the unpredictability of effects, even an optimal plan for reaching a goal may get “unlucky” and fail with some probability. For this reason, we evaluated each planner multiple times on each problem and did not include a separate “optimal planner” track. In addition, since some planners may fail to reach the goal state for some executions, we needed a way of trading off between goal attainment and action cost. We decided to score an execution as goal reward minus action cost and chose a goal reward for each problem. Section 3 describes the evaluation methodology in further detail.
In total, we designed eight domains for the competition (Section 4). Some domains were simply noisy versions of classical planning domains, while other domains were designed specifically to thwart greedy replanning approaches that ignore uncertainty.
Ten planners from seven different groups entered the competition. The results of the competition are presented in Section 5. A deterministic replanner performed best overall, primarily due to a disproportionate number of noisy classical planning problems in the evaluation suite. Some domains proved challenging for all participating planners. These latter domains could serve as a basis for future probabilistic planning competitions.
Håkan L. S. Younes