Heuristic Flaw Selection.

Distance based heuristics have been used extensively for ranking plans in state space planners (e.g., HSP and FF). Nguyen and Kambhampati (2001) show that these heuristics can be very useful for ranking plans in POCL planners as well. They also suggest that the same heuristics could be used in flaw selection methods, but do not elaborate further on this subject.

It is not hard to see, however, how many of the plan rank heuristics could be used for the purpose of selecting among open conditions, since they often are based on estimating the cost of achieving open conditions as seen in Section 3.1.1. By giving priority to open conditions with the highest heuristic cost, we can build plans in a top-down manner from the goals to the initial conditions. We call this ordering criterion ``MC'' (most cost first). By using the opposite ordering criterion, ``LC'', we would instead tend to build plans in a bottom-up manner. Note that these two ordering criteria only apply to open conditions, and not to threats, so we would need to use them in combination with selection criteria for threats. We can define both global and local heuristic flaw selection strategies:

MC	{n, s}LR / {o}MC_add
MC-Loc	{n, s}LR / {l}MC_add

The subscript for MC indicates the heuristic function to use for ranking open conditions, which in this case is the additive heuristic.

In Section 3.1.3, we proposed that we can estimate the remaining planning effort for an open condition by counting the total number of open conditions that would arise while resolving the open condition. This heuristic could also be useful for ranking open conditions, and is often more discriminating than an ordering criterion based on heuristic cost. We therefore define two additional ordering criteria: ``MW'' (most work first) and ``LW''. With these, we can define additional flaw selection strategies:

MW	{n, s}LR / {o}MW_add
MW-Loc	{n, s}LR / {l}MW_add

For the planning problems listed in Table 5, we can see that MW-Loc is at most as sensitive to precondition order as MC-Loc, with MW-Loc never performing worse than MC-Loc and for the first two problems performing clearly better.

IXTET (Ghallab & Laruelle, 1994; Laborie & Ghallab, 1995) also uses heuristic techniques to guide flaw selection, but in quite a different way than suggested here. It is our understanding of the IXTET heuristic that it estimates, for each possible refinement r resolving a flaw, the amount of change (commitment) that would result from applying r to the current plan. For open conditions, this estimate is obtained by expanding a tree of subgoal decomposition, which in principal is a regression-match graph (McDermott, 1999). This is similar to how heuristic values are computed using the additive heuristic. However, IXTET considers not only the number of actions that need to be added to resolve an open condition but also to what degree current variable domains would be reduced and possible action orderings restricted. Furthermore, IXTET uses the heuristic values to choose the flaw in which a single refinement stands out the most as the least ``costly'' compared to other refinements for the same flaw. The intended effect is a reduction in the amount of backtracking that is needed to find a solution, although we are not aware of any evaluation of the effectiveness of the technique.

Håkan L. S. Younes
2003-08-26