Comments on Ranking

The frequency-based ranking method used with CA-ED is simple, fast and was shown to produce useful macros. Part of its success is due to the combination with static pruning rules. In particular, limiting macro length to only two actions simplifies the problem of macro ranking and filtering.

However, in the general case, the savings that a macro can achieve depend not only on how often it occurs as part of a solution, but also on several other factors, which can interact. Examples of such factors include the number of search nodes that the application of a macro would save, and the ratio of useful instantiations of a macro (providing shortcuts towards a goal state) versus instantiations that guide the search into a wrong direction. See the work of McCluskey and Porteous [21] for more details on factors that determine the performance of macro-operators in AI planning.

The reason why we have extended our approach from frequency-based ranking to gradient-descent ranking is that integrating such factors as above into a ranking method is expected to produce more accurate results. Compared to frequency-based ranking, gradient-descent ranking measures the search performance of a macro more directly. To illustrate this, consider the solution plan in Figure 12. Table 2 shows the 5 distinct macro-operators extracted from this solution plan. For each macro, both the gradient-descent weight and the frequency-based weight are shown. In the latter case, the bonus points are ignored, since they do not affect the ranking (all macros will receive the same amount of bonus points for being part of this solution plan). Each method produces a different ranking. For example, macro TAKE-IMAGE TURN-TO is ranked fourth with the gradient-descent method and second with the frequency-based method. The reason is that a macro such as TURN-TO CALIBRATE (or SWITCH-ON TURN-TO) saves more search nodes than TAKE-IMAGE TURN-TO, even though it appears less frequently in the solution.

Table 2: Macros generated in the Satellite example.

Macro	Weight	Occurrences
TURN-TO TAKE-IMAGE	0.999103	3
TURN-TO CALIBRATE	0.999103	1
SWITCH-ON TURN-TO	0.999103	1
TAKE-IMAGE TURN-TO	0.999401	2
CALIBRATE TURN-TO	0.999700	1

When compared to the simple and fast frequency-based method, gradient-descent ranking is more expensive. Each training problem has to be solved several times; once with no macros in use and once for each macro. As shown in Table 3 in Section 5, the training time can become an issue in domains such as PSR and Pipesworld Non-Temporal Tankage. Both ranking techniques ignore elements such as the interactions of several macros when used simultaneously, or the effects of macros on the quality of plans. See Section 5 for an evaluation of the latter.

Macro ranking is a difficult problem. The training data is often limited. In addition, factors such as frequency, number of search nodes that a macro would save, effects on solution quality, etc. have to be combined into a total ordering of a macro set. There is no clear best solution for this problem. For example, should we select a macro that speeds up planning but increases the solution length?