Problem Domains

The problem domains selected for the competitions have been, or have become, benchmark domains used by much of the community for empirical evaluation. The domains that have been used have often been chosen to probe some specific detail of performance. This has sometimes meant that the domains are not representative of general features of planning and are inappropriate for use in more widespread testing. A description of the domains used in all of the competitions so far can be found in Appendix A.

In the third competition, eight families of domains were used, broadly divided into transportation domains (Depots, DriverLog and ZenoTravel), applications-inspired domains (Rovers and Satellite) and a small collection of others (Settlers, Freecell and UM-Translog-2).

We briefly summarise the collection here and describe them in more detail in Appendix A.

Depots: This domain combines the transportation-style problem of Logistics with the well-known Blocks domain. The Logistics domain exhibits a high degree of parallelism, since separate vehicles can be utilised concurrently. In contrast, the Blocks domain is characterised by significant goal interaction. Our intention in combining them was to discover whether the successes of planners in each of these domains separately could be brought together when the problems were combined.
DriverLog: This problem involves transportation, but with the twist that the vehicles must be supplied with a driver before they can move.
ZenoTravel: Another transportation problem, inspired by a domain used in testing the ZENO planner developed by Penberthy and Weld [Penberthy & Weld 1994], in which people must embark onto planes, fly between locations and then debark, with planes consuming fuel at different rates according to speed of travel.
Satellite: This domain was inspired by the problem of scheduling satellite observations. The problems involve satellites collecting and storing data using different instruments to observe a selection of targets.
Rovers: This domain was motivated by the 2003 Mars Exploration Rover (MER) missions and the planned 2009 Mars Science Laboratory (MSL) mission. The objective is to use a collection of mobile rovers to traverse between waypoints on the planet, carrying out a variety of data-collection missions and transmitting data back to a lander. The problem includes constraints on the visibility of the lander from various locations and on the ability of individual rovers to traverse between particular pairs of waypoints.
Settlers: This domain revolves around the management of resources, measured using metric-valued variables. Products must be manufactured from raw materials and used in the manufacture or transportation of further materials. New raw materials can be generated by mining or gathering. The objective is to construct a variety of structures at various specified locations.
UM-Translog-2: This domain is a PDDL2.1 encoding of a new variant of the UM-Translog domain [Wu & Nau 2002], generated for us by Dan Wu of the University of Maryland. It is essentially a transportation domain, but one that is significantly more complex than previous transportation benchmarks. Because this domain was introduced late in the competition, very little data was collected, and it is therefore not discussed further in this paper.

We also reused the Freecell domain from the second competition. This domain presented a serious challenge to participants in 2000 and we were interested to see whether planning technology had surpassed this challenge in the intervening two years. Although the domain produced some interesting data, we did not attempt to measure precisely the extent to which the 2002 performance surpassed that of 2000.

Each domain (other than Settlers, Freecell and UM-Translog-2) was presented to the competitors at a minimum of the four levels previously identified: STRIPS, NUMERIC, SIMPLETIME and TIME. The problems presented at each of these levels comprised distinct tracks and the competitors were able to choose in which tracks they wished to compete.

In addition to the four main tracks we included two further tracks, intended to explore particular ideas. These tracks did not necessitate the use of additional expressive power but simply allowed existing expressiveness to be combined to produce interesting planning challenges. For example, the HARDNUMERIC track consisted of problems from the Satellite domain that had very few logical goals. Plans were evaluated by a metric based on the amount of data recorded rather than by determining whether a specified logical goal had been achieved. The challenge was for planners to respond to the plan metric and include actions that would acquire data. The COMPLEX track consisted of problems that combined temporal and numeric features. The challenge was to reason about resource consumption in parallel with managing temporal constraints.

In total, we developed 26 domains, with 20 problem instances in each domain (a few, unintentionally, ended up with 16 or 22 instances). In most domains there were an additional 20 instances of large problems intended for the hand-coded planners. In total there were nearly 1000 problem instances to be solved, of which about half were intended primarily for the fully-automated planners.
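
To make the HARDNUMERIC idea concrete, the following PDDL2.1 problem fragment sketches what a problem with almost no logical goals and a data-based plan metric might look like. It is an illustration only, not one of the competition files: the problem name, the objects, the predicate and function names (such as data-stored and data_capacity) and all numeric values are assumptions chosen for the example, and the fragment presumes a matching Satellite-style domain definition that is not shown.

```pddl
; Illustrative sketch only: every name and number below is hypothetical.
(define (problem hardnumeric-sketch)
  (:domain satellite)                       ; assumes a Satellite-style domain exists
  (:objects sat0 - satellite
            inst0 - instrument
            img - mode
            star1 star2 - direction)
  (:init
    (on_board inst0 sat0)
    (supports inst0 img)
    (pointing sat0 star1)
    (power_avail sat0)
    (= (data_capacity sat0) 1000)           ; hypothetical storage limit
    (= (data star1 img) 200)                ; hypothetical cost of each observation
    (= (data star2 img) 300)
    (= (data-stored) 0))
  ; The only logical goal already holds in the initial state,
  ; so the empty plan is formally a valid plan.
  (:goal (and (pointing sat0 star1)))
  ; Plan quality is judged by how much data the plan records.
  (:metric maximize (data-stored)))
```

In a problem of this shape, a planner that attends only to the logical goal can return an empty plan, so the quality of its answer depends entirely on whether it reads the metric and adds data-acquiring actions.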

Derek Long 2003-11-06