Introduction

One of the currently most successful approaches to domain-independent planning is forward-chaining heuristic search through the problem state space. Search is guided by a heuristic function based on an appropriate relaxation of the planning problem. Different relaxations have been explored [Bonet & Geffner 2000, McDermott 1996, Hoffmann & Nebel 2001, Helmert 2004] and have been shown to result in more or less informative heuristic functions. A common relaxation is to ignore the delete lists of actions in the problem domain, resulting in an abstracted problem domain comprised of relaxed actions. A given state can then be evaluated by counting the number of relaxed actions needed to reach the goal state from the given state. Hoffmann and Nebel (2001) present a search strategy called Enforced Hill-Climbing (EHC) which, coupled with a relaxation of this kind, has been proven empirically to be an effective strategy in many planning domains. Their planner, FF, performed with great success in the Second and Third International Planning Competitions [Bacchus 2001, Long & Fox 2003]. In this paper we present our planner, Marvin, which builds upon this search approach.

The EHC search strategy performs gradient-descent local search, using breadth-first search to find action sequences leading to strictly-better states should no single-action step be able to reach one. This embedded exhaustive-search step is one of the bottlenecks in planning with this approach. We present an approach that, through memoising the plateau-escaping action sequences discovered during search, can form macro-actions which can be applied later when plateaux are once-again encountered. In doing so, the planner can escape from similar plateaux encountered later, without expensive exhaustive search. The resulting planner is called Marvin.

We begin this paper with a brief review of FF's search behaviour to provide the background for our approach. We then introduce the main features of Marvin, explaining how its search behaviour differs from that of FF. We describe the three main contributions made by Marvin, detailing the key algorithms and their effects on performance. Marvin can plan in STRIPS and ADL domains, and it can also handle the derived predicates of PDDL2.2. We describe the way in which domains containing derived predicates and ADL are handled without first being reduced to STRIPS domains. Finally, we discuss the results obtained by Marvin in the Fourth International Planning Competition (IPC 4) [Hoffmann & Edelkamp 2005], and provide additional ablation studies to assess the impact of its various features on planning performance across a selection of domains.

Andrew Coles and Amanda Smith 2007-01-09