Conflict-Driven Flaw Selection.

Common wisdom in implementing search heuristics for constraint satisfaction problems, e.g. propositional satisfiability, is to first make decisions with maximal consequences, so that inconsistencies can be detected early on, pruning large parts of the search space.

A flaw selection strategy that follows this principle would be to link unsafe open conditions before other open conditions. We call an open condition unsafe if a causal link to that open condition would be threatened. By giving priority to unsafe open conditions, the planner will direct attention to possible conflicts/inconsistencies in the plan at an early stage. We introduce the flaw type ``u'' representing unsafe open conditions. Examples of conflict-driven flaw selection strategies using this new flaw type are the following variations of LCFR, LCFR-Loc, and MW-Loc:

LCFR-Conf     {n, s, u}LR  /  {o}LR    
LCFR-Loc-Conf     {n, s, u}LR  /  {l}LR    
MW-Loc-Conf     {n, s}LR  /  {u}MWadd  /  {l}MWadd    

The first two of these conflict-driven strategies are very effective in the link-chain domain constructed by Veloso and Blythe [29]. The link-chain domain is an artificial domain specifically constructed to demonstrate the weakness of POCL planners in certain domains. What makes the domain hard for SNLP and UCPOP with their default flaw selection strategies is that open conditions can be achieved by several actions but with only one action being the right choice because of negative interaction. This forces the POCL planner to backtrack excessively over link commitments, but inconsistencies may not be immediately detected because of the many link alternatives. We can see in Figure 1 that VHPOP using the UCPOP flaw selection strategy performs very poorly in the link-chain domain. Using a more sophisticated flaw selection strategy such as LCFR improves performance somewhat. However, with the two conflict-driven flaw selection strategies all problems are solved in less than a second. The number of generated and explored plans is in fact identical for LCFR-Conf and LCFR-Loc-Conf, but LCFR-Loc-Conf is roughly twice as fast as LCFR-Conf because of reduced overhead. This demonstrates the benefit of local flaw selection strategies. Note, however, that LCFR is faster than LCFR-Loc in the link-chain domain, so local strategies are not always superior to global strategies.

Figure: Average planning time over ten problems for each point with different flaw selection strategies in the link-chain domain. Results are shown for VHPOP using five different flaw selection strategies. Only points for which a strategy solved all ten problems without running out of memory (512Mb) are shown. The hf heuristic was used to rank plans.

We can also see in Table 1 that conflict-driven flaw selection strategies work well in the DriverLog and Depots domains, both with hadd and haddr as heuristic function for ranking plans.

Håkan L. S. Younes