The Tileworld Problems

Next: The Trains and Get-Paid Up: The Need for Domain Previous: The Need for Domain

The Tileworld Problems

The Tileworld domain involves a grid with tiles and holes, and the goal is to fill each hole with a tile. This goal can be achieved with a fill operator, which has two preconditions: the agent must be at the hole, and it must be holding a tile. In our encoding, an agent can hold up to four tiles at a time. The go operator is used to achieve the (sub)goal of being at a hole, while the pickup operator is used to achieve the (sub)goal of holding a tile. In the normal way, go has a precondition of being at some location, namely whatever location the agent will move from. Pickup has a precondition of being at the location of some tile. The problems in the Tileworld problem set differ from one another in the number of holes that the agent must fill: each problem adds another hole.

Figures gives the log-log plot for the various strategies on the Tileworld problems, when the preconditions were entered in the default order. Note that LCFR (S+OC+UC) is the strategy highlighted. Three other strategies were almost indistinguishable from LCFR (S+OC+UC), namely, LCFR (S+OC+.1UC+F), DUnf-Gen (S+OC+UC) and DUnf-Gen(S+OC+.1UC+F). All the other strategies performed worse. This can more easily be seen in Figure , which gives the aggregate performance for the leading strategies: those that were able to solve all seven Tileworld problems. In fact, these leading strategies were able to solve the seven Tileworld problems without generating more than 1800 nodes for any problem. In contrast, the remaining strategies failed on at least one, and up to four, of the seven problems, given the limit of 100,000 nodes generated.

Figure: Tileworld Problems: Node Counts for All Strategies

Figure: Tileworld Problems: Aggregate Performance for Leading Strategies

What was originally surprising to us is that on the Tileworld problems, delaying separable threats actually seems to hurt performance. The strategies that did best were those like LCFR and DUnf-Gen that do not delay separable threats. LCFR-DSep, ZLIFO, DSep-LC, and DSep all generated larger search spaces, in contrast to what we would have predicted given the experiments on the basic problem set.

To understand this result, we looked in detail at the planning trace for these problems. What that revealed is that for the Tileworld domain, the early resolution of separable threats has an important advantage: it imposes what turns out to be the correct temporal ordering between the steps of going to up a tile (to pick it up), and carrying it to a hole. Virtually all the strategies create subplans like the one shown in Figure . The goals involve filling holes, so the planners insert steps to go to and pick up a tile, and to go to the hole. At this point, there are two separable threats: (1) the effect of going to the hole, (¬at(X)), threatens the link between going to the tile and picking it up (at(Z)), and (2) the effect of going to the tile, (¬at(W)), threatens the link between going to the hole and filling it (at(Y)). Both threats are separable, because X and W will be unbound; the planner does not yet know where it will be traveling from. But there is only one valid temporal ordering that will resolve these threats: going to the tile must precede picking up the tile, which in turn must precede going to the hole. Once this temporal ordering is determined, further planning goes smoothly.

Figure: Typical Partial Plan for the Tileworld Domain

In contrast, if this ordering decision is not made, the planner can often ``get lost'', attempting to find plans in which it goes from some location to the hole and then from the hole to the tile. There are many ways to attempt this, because there are many different tiles to select, and many different locations to move among. The planner may try many of these alternatives before determining that there is a fundamental inconsistency in these plans, and that they are destined to fail. The larger the number of holes to be filled, the worse the situation becomes.

Sometimes the planner may make the right decision about temporal ordering even if it has deferred separable threats. When faced with the partial plan in Figure , if the planner does not select a threat, it will select from among several open conditions. It can attempt to establish the precondition of going to the hole (at(X)) by reusing the effect of going to the tile (at(Z)), or it can do the reverse, and attempt to establish the precondition of going to the tile (at(W)) by reusing the effect of going to the hole (at(X)). Of course, the first solution is the right one, and includes the critical temporal ordering constraint, while the second will eventually fail.

The order in which the open conditions are selected will determine which of these two choices the planner makes. When preconditions are entered in the default order, planners that delay separable threats end up making the latter, problematic choice. In contrast, when the preconditions are entered in the reverse order, the planners make what turns out to be the correct choice. Thus, for the experiments in which we reversed precondition insertion, we see a different pattern of performance, as shown in Figures -.

Figure: Tileworld Problems: Node Counts with Reversed Precondition Insertion

Figure: Tileworld Problems: Aggregate Performance for all Strategies with Reversed Precondition Insertion

When the preconditions are entered in the reverse order, a larger number of strategies perform well, solving all the problems. In particular, with S+OC+.1UC+F node-selection, the performance of LCFR, DUnf-Gen, ZLIFO, and LCFR-DSep is virtually indistinguishable. It is important to note that the leading strategies that do not delay separable threats--LCFR and DUnf-Gen--are not affected much by the reversal of precondition insertion for the Tileworld problems; in fact, LCFR's performance is identical in both cases. In contrast, the strategies that use separable-threat delay--LCFR-DSep, ZLIFO, and DSep-LC--all perform much better when we reverse precondition insertion. This is explained by our analysis above.

In sum, what is most important for the Tileworld domain is for the planner to recognize, as early as possible, that there are certain required temporal orderings between some of the steps in any successful plan. Every successful plan will involve going to a tile before going to a hole, although there is flexibility in the order in which multiple holes are visited, and in the interleaving of picking up tiles and dropping them in holes. For the strategies we studied, there were two different methods that led to this temporal constraint being added to the plan. It was added when the planner selected a separable threat to resolve, and it was added when it selected one particular precondition to resolve before another.

Next: The Trains and Get-Paid Up: The Need for Domain Previous: The Need for Domain

TOM Conversion
Wed Jun 25 17:40:58 EDT 1997