To evaluate the performance of our two abstraction methods, we compare four setups of Macro-FF. In all four setups, the planner includes the implementation enhancements described in Section 4. Setup 1 is the planner with no macros. Setup 2 includes CA-ED, the method described in Section 2. Setup 3 uses SOL-EP, the method described in Section 3. Setup 4 combines Setups 2 and 3. Since both methods have benefits and limitations, it is interesting to analyze how they perform when applied together. In Setup 4, we first run CA-ED, obtaining an enhanced domain. Next we treat this as a regular domain and apply SOL-EP to generate a list of SOL-EP macros. Finally, the enhanced planner takes as input the enhanced domain, the list of SOL-EP macros, and the regular problem instances.
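The combined pipeline of Setup 4 can be summarized in a minimal sketch. All function names below (run_ca_ed, run_sol_ep, combined_setup) are hypothetical stubs for illustration, not the actual Macro-FF interface:

```python
# A minimal sketch of the combined setup (Setup 4). All names here are
# hypothetical stubs, not the actual Macro-FF interface.

def run_ca_ed(domain):
    # CA-ED adds component-abstraction macros to the domain as new operators.
    return {"operators": domain["operators"] + ["ca-ed-macro"]}

def run_sol_ep(domain, problems):
    # SOL-EP treats the (enhanced) domain as a regular domain and extracts
    # a list of macros from sample solutions (stubbed here).
    return ["sol-ep-macro"]

def combined_setup(domain, problems):
    enhanced = run_ca_ed(domain)             # step 1: CA-ED enhanced domain
    macros = run_sol_ep(enhanced, problems)  # step 2: SOL-EP macro list
    # step 3: the enhanced planner receives the enhanced domain,
    # the SOL-EP macros, and the regular problem instances.
    return enhanced, macros, problems

enhanced, macros, problems = combined_setup({"operators": ["op-a", "op-b"]}, ["p01"])
print(enhanced["operators"])  # ['op-a', 'op-b', 'ca-ed-macro']
```

The point of the ordering is that SOL-EP sees the CA-ED macros as ordinary operators, so its extracted macros can build on them.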
Since component abstraction can currently be applied only to STRIPS domains with static facts in their formulation, we used as testbeds domains that satisfy these constraints. We ran this experiment on Rovers (20 problems), Depots (22 problems), and Satellite (36 problems). These domains were used in the third international planning competition (IPC3), and Satellite was reused in IPC4 with an extended problem set. In our experiments, the Rovers and Depots problem sets are the same as in IPC3, and the Satellite problem set is the same as in IPC4.
Figures 20–23 and Table 4 summarize our results. Performance consistently improves when macros are used. Interestingly, combining CA-ED and SOL-EP often leads to better performance than either abstraction method taken separately. In Rovers, all three abstraction setups produce quite similar results, with a slight advantage for the combined setup. In Depots, CA-ED is more effective than SOL-EP in terms of expanded nodes. The differences in CPU time are smaller, since adding new operators to the original domain significantly increases the cost per node in Depots (see the discussion below). Again, the overall winner in this domain is the combined setup. In Satellite, adding macros to the domain reduces the number of expanded nodes, but significantly increases the cost per node (see Table 4 later in this section) and the memory requirements. Setups 2 and 4, which add macros to the original domain, fail to solve three problems (32, 33, and 36) because of the large memory requirements of FF's preprocessing phase.
Table 4 evaluates how macros affect the cost per node in the search. The cost per node is defined as the total search time divided by the number of evaluated states. A value in the table is the cost per node in the corresponding setup (i.e., CA-ED or SOL-EP) divided by the cost per node in the setup with no macros. For each of the two methods we show the minimum, the maximum, and the average value. When macros are added to the original domain (i.e., the domain is enhanced), the increase in cost per node can be significant: the average ratio is 7.70 in Satellite and 6.06 in Depots. Interestingly, this ratio is less than 1 in Rovers. This is an effect of solving a problem with fewer expanded nodes: operations such as managing the open and closed lists have costs that grow with the size of the lists at a rate that can be higher than linear. The right part of the table shows much better ratios when macros are used in an enhanced planner.
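The ratio reported in Table 4 can be made concrete with a small sketch. The helper names and all numbers below are placeholders for illustration, not our measured data:

```python
# Hypothetical helpers illustrating the cost-per-node ratio of Table 4;
# the timings and node counts below are placeholders, not measured data.

def cost_per_node(search_time: float, evaluated_states: int) -> float:
    """Total search time divided by the number of evaluated states."""
    return search_time / evaluated_states

def cost_ratio(time_macro, nodes_macro, time_plain, nodes_plain):
    """Cost per node in a macro setup, relative to the no-macro setup."""
    return cost_per_node(time_macro, nodes_macro) / cost_per_node(time_plain, nodes_plain)

# If macros double the total search time while halving the number of
# evaluated states, the cost per node quadruples.
print(cost_ratio(2.0, 500, 1.0, 1000))  # 4.0
```

Note that the ratio can exceed 1 even when total search time drops, as long as the node count drops faster; this is exactly the situation in Satellite and Depots.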
It is important to analyze why macros added as new operators cause such an increase in cost per node in Satellite and Depots. The overhead occurs mostly in the relaxed Graphplan algorithm that computes the heuristic value of a state. The complexity of this algorithm depends on the total number of actions instantiated during preprocessing for a given problem. Adding new operators to a domain increases the number of preinstantiated actions. Since macros tend to have more variables than regular operators, the corresponding number of instantiations can be significantly larger. Let the action instantiation rate be the number of actions instantiated for a problem when macros are used, divided by the number of actions instantiated in the original domain formulation. Our statistics show that the average action instantiation rate is 6.03 in Satellite, 3.20 in Depots, and 1.04 in Rovers.
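The blow-up has a simple combinatorial source: an operator with k variables over n objects of matching types has up to n^k groundings, so a macro with more variables than its component operators can multiply the instantiation count. A rough sketch under that upper-bound assumption (names and counts are illustrative, not Macro-FF's preprocessing or our statistics):

```python
# Rough upper bound on grounded actions: an operator with k variables
# over n candidate objects has up to n**k instantiations. Names and
# numbers are illustrative, not Macro-FF internals.

def max_instantiations(num_objects: int, num_variables: int) -> int:
    return num_objects ** num_variables

def instantiation_rate(ops_with_macros, ops_original, num_objects):
    """Ratio of (upper bounds on) instantiated actions with and without
    macros; each operator is represented by its number of variables."""
    with_macros = sum(max_instantiations(num_objects, k) for k in ops_with_macros)
    original = sum(max_instantiations(num_objects, k) for k in ops_original)
    return with_macros / original

# Two original operators with 2 variables each; adding one 4-variable
# macro over 10 objects raises the bound from 200 to 10200 groundings.
print(instantiation_rate([2, 2, 4], [2, 2], num_objects=10))  # 51.0
```

Real grounders prune with type information and static facts, so observed rates such as 6.03 in Satellite are far below this worst case, but the exponential dependence on the number of variables is what makes macros expensive to preinstantiate.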
The results show no significant impact of macro-operators on solution quality. When macros are used, plan length varies slightly in both directions, with an average close to that of the original FF system.


