As in FF, Marvin resorts to best-first search if EHC proves unable to find a solution. This approach maintains the completeness of the planner in the cases where the use of EHC with helpful actions would otherwise render the search incomplete. Two other planners in IPC 4 used variations on the best-first search algorithm, YAHSP [Vidal 2004] and Fast-Downward [Helmert 2006]. Unlike Marvin, however, in these two planners there is no incomplete search step (such as EHC) before using a modified best-first search algorithm. In YAHSP, conventional WA* search is used but within the search queue, states reached by applying a helpful action in their parent state are ordered before those which were not. In Fast-Downward, a `deferred heuristic evaluation' strategy is used, where states are inserted into the search queue with the heuristic value of their parent state; the actual heuristic cost of the state is then only calculated when the state is expanded.

In Marvin the best-first strategy is modified by greedily expanding the first successor found with a better heuristic than its parent state, but retaining the parent so that its remaining children can be evaluated later if necessary. The effect of this is similar to the approach taken in Fast-Downward, and would lead to the nodes in the search space being visited in the same order. The approach taken in Marvin, however, allows a smaller search queue to be maintained, as nodes are not necessarily inserted into the search queue for each successor node reachable from a state.

Whenever a successor state is generated and evaluated (by calculating its heuristic value), one of two things happens:

- If the successor has a heuristic better than its parent, the successor is placed at the front of the search queue, with its parent state behind it (along with a counter variable, noting how many successors have already been evaluated); and the search loop is then restarted from the successor state.
- If the successor has a heuristic no better than its parent, the successor is inserted into the search queue in its appropriate place (stable priority-queue insertion, ordered by heuristic value). The process then carries on evaluating the successors of the parent state.

The pseudo-code for this can be seen in Figure 3. The approach is inspired by the idea of taking the first strictly-better successor when performing EHC search, with the benefit that the number of heuristic evaluations to be performed is potentially reduced by considering fewer successors to each state. It differs from EHC in that, to maintain completeness, the parent state is not discarded--it is placed back in the queue to have its other successors evaluated later if necessary. Theoretically, if EHC search on a given problem does not encounter any plateaux, and any pruning from selecting only the helpful actions is ignored, then using greedy best-first search on that problem would visit the same number of nodes and evaluate the same number of successors. If a plateau was encountered, however, the search behaviour would differ as EHC would only consider states reachable from the state at the start of the plateau.

Another effect of the greedy best-first search is that the search focusses on exploring in a given direction. As has been described, as soon as a successor node is found with a heuristic value better than that of its parent, then the further expansion of the parent node is postponed and the successor node is expanded. The practical consequence of this is that as the search queue does not contain the other equally good successors, any search forward from a successor state will not be sidetracked by also having to search forward from its sibling states. The parent node will be re-visited, and the other sibling nodes added, but only if it proves heuristically wise to do so--that is, if searching forward from the successor node is not making heuristic progress.

Andrew Coles and Amanda Smith 2007-01-09