
## Learning Macros in Other Domains

Table 12: The resources consumed while learning in various domains. Each number represents the mean over 100 learning sessions.

| Domain | Operator applications (Mean) | Operator applications (Std.) | CPU seconds (Mean) | CPU seconds (Std.) | Problems (Mean) | Problems (Std.) |
|---|---|---|---|---|---|---|
| 24-puzzle | 859,497 | 186,823 | 4842.0 | 2270.0 | 62.8 | 7.2 |
| 10-cannibals | 216,572 | 88,467 | 14.2 | 6.0 | 63.9 | 12.0 |
| 10-stones | 144,303 | 1,653 | 8.4 | 1.9 | 52.0 | 0.3 |
| 5-Hanoi | 377,671 | 80,968 | 35.2 | 3.5 | 76.2 | 9.5 |
| Grid | 1,956,972 | 1,191,383 | 58.8 | 35.0 | 185.0 | 60.2 |

Table 13: Statistics of the macro sets generated in various domains. Each number represents the mean over 100 sets.

| Domain | Total number (Mean) | Total number (Std.) | Mean length (Mean) | Mean length (Std.) | Max length (Mean) | Max length (Std.) |
|---|---|---|---|---|---|---|
| 24-puzzle | 15.32 | 0.79 | 8.68 | 0.24 | 18.00 | 0.0 |
| 10-cannibals | 2.16 | 0.39 | 2.93 | 0.15 | 3.98 | 0.2 |
| 10-stones | 1.20 | 0.40 | 2.00 | 0.00 | 2.00 | 0.0 |
| 5-Hanoi | 11.47 | 0.71 | 7.24 | 0.25 | 16.00 | 0.0 |
| Grid | 17.50 | 3.14 | 8.62 | 0.82 | 16.36 | 1.5 |

Table 14: The performance of MICRO-HILLARY (in operator applications) before and after learning in various domains.

| Domain | Before learning (Mean) | Before learning (Std.) | After learning (Mean) | After learning (Std.) |
|---|---|---|---|---|
| 24-puzzle | 711,545 | 134,807.0 | 1540.0 | 57.5 |
| 10-cannibals | 57 | 40.5 | 31.6 | 0.4 |
| 10-stones | 206 | 84.8 | 126.6 | 8.1 |
| 5-Hanoi | 10,993 | 14,623.3 | 156.0 | 9.4 |
| Grid | 1,529 | 1,313 | 369.0 | 35.0 |

We applied MICRO-HILLARY to the other domains specified in Section 4.1. Tables 12, 13 and 14 show the mean results for 100 learning sessions. MICRO-HILLARY was able to reach quiescence in all the domains. The 10-stones and 10-cannibals domains are very simple: one or two macros were sufficient to reach quiescence. Note that we used the same quiescence parameter, 50 problems, for all the domains. After solving each problem, MICRO-HILLARY increases the length of the random sequence used for generating a training problem by 100. Therefore, MICRO-HILLARY spends about 125,000 operator applications just to make sure that there is nothing new to learn. In the simple domains, this accounts for most of the resources used by MICRO-HILLARY.
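The 125,000 figure can be checked with a short calculation. The sketch below assumes that the cost being counted is the operator applications of the random sequences that generate the 50 quiescence problems, and that the sequence length grows by exactly 100 per problem. The starting length is not stated here, so two hypothetical starting values are tried; the function name `generation_cost` is illustrative only.

```python
def generation_cost(num_problems, increment=100, first_length=100):
    """Sum of random-sequence lengths over num_problems training problems.

    Assumption: the k-th problem (k = 0, 1, ...) is generated by a sequence
    of first_length + increment * k operator applications.
    """
    return sum(first_length + increment * k for k in range(num_problems))


# Two hypothetical starting lengths bracket the figure quoted in the text.
low = generation_cost(50, first_length=0)     # lengths 0, 100, ..., 4900
high = generation_cost(50, first_length=100)  # lengths 100, 200, ..., 5000

print(low, high, (low + high) / 2)  # 122500 127500 125000.0
```

Under either assumption the total is within 2% of 125,000, which is consistent with reading the quoted number as an approximation of 100 × 50² / 2.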

It is interesting to look at the macros learned in the grid domain. Most of the macros combine S, W, and N moves, where S stands for south, W for west, and N for north, and the S and N moves are equal in number. Such macros are used to make detours around walls that block the search.
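To see why equal numbers of S and N moves yield a detour, one can compute the net displacement of a macro. The sketch below is illustrative only: the coordinate convention (x grows eastward, y grows southward), the helper `net_displacement`, and the example macro `"SSWNN"` are all assumptions, not macros reported in the experiments.

```python
# Assumed convention: x grows to the east, y grows to the south.
MOVES = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0)}


def net_displacement(macro):
    """Return the (dx, dy) offset produced by applying a string of moves."""
    dx = sum(MOVES[move][0] for move in macro)
    dy = sum(MOVES[move][1] for move in macro)
    return dx, dy


# Hypothetical macro: step south twice, west once, then north twice.
example = "SSWNN"
print(net_displacement(example))  # (-1, 0)
```

Because the S and N moves cancel, the macro as a whole moves the agent only westward, while the intermediate positions pass below an obstacle that blocks the direct westward step.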

MICRO-HILLARY improved problem-solving performance in each of the domains. The most notable improvement is in the 24-puzzle domain, where the mean number of operator applications drops from 711,545 before learning to 1,540 after learning, an improvement by a factor of about 462.
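The improvement factor for every domain follows directly from the means in Table 14. The snippet below is a small sketch of that arithmetic; the dictionary and variable names are illustrative, and the numbers are copied from the table.

```python
# Mean operator applications (before learning, after learning), from Table 14.
before_after = {
    "24-puzzle": (711_545, 1540.0),
    "10-cannibals": (57, 31.6),
    "10-stones": (206, 126.6),
    "5-Hanoi": (10_993, 156.0),
    "Grid": (1_529, 369.0),
}

# Improvement factor = mean cost before learning / mean cost after learning.
speedup = {domain: before / after for domain, (before, after) in before_after.items()}

for domain, ratio in speedup.items():
    print(f"{domain:12s} {ratio:8.1f}x")
```

The ratios range from under 2 in the two simple domains to roughly 70 for 5-Hanoi and roughly 462 for the 24-puzzle.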

Shaul Markovitch
1998-07-21