We also tested PARAMETRIC MICRO-HILLARY in other parameterized domains. In the N-Cannibals and N-Stones domains, PARAMETRIC MICRO-HILLARY learned all of its macros at the minimal value of the parameter (3). The test was performed using problems with a parameter of 20 in both domains. MICRO-HILLARY's performance improved in both domains, and problem solving proceeded without encountering local minima.
The N-Hanoi domain family is recursive in nature, and we did not expect MICRO-HILLARY to find a complete macro set for these domains. The length of the macros should grow with the number of rings; therefore, MICRO-HILLARY should not reach quiescence in these domains. We were surprised to find that PARAMETRIC MICRO-HILLARY achieved quiescence after solving problems of 7 or 8 rings (7.13 on average). This premature quiescence was caused by the domain-independent training-problem generator: the probability that the largest ring will be moved from its target location by a random sequence of moves is very low. Indeed, when we increased the length of the sequences used for generating training problems and increased the quiescence parameter, learning continued, but MICRO-HILLARY still reached quiescence after solving problems with 9 rings. In both cases, the macros learned were not sufficient for solving problems with a parameter that is larger than the values encountered during training. The test was performed with problems of 6 rings. To solve domain families such as N-Hanoi, we should extend MICRO-HILLARY and endow it with the capability of generating recursive macros. To keep PARAMETRIC MICRO-HILLARY from quitting prematurely, we can modify the problem generator of MICRO-HILLARY to use random sequences whose lengths are based on the domain parameter. Alternatively, we can use domain-specific problem generators. The results of this experiment are summarized in Tables 16, 17 and 18.
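The effect of the random-walk generator can be illustrated with a small sketch. It is not the generator used in the experiments: the state encoding, the function names, and in particular the rule in `scaled_walk_length` (walk length proportional to 2^n) are assumptions made for illustration only. The sketch starts from the goal state (all rings on the last peg), applies a random sequence of legal moves, and measures how often the largest ring ends up away from its target peg. Since the n-1 smaller rings must first be stacked on another peg, displacing the largest ring takes at least 2^(n-1) moves, so a walk shorter than that can never produce a problem that requires moving it.

```python
import random


def legal_moves(state):
    """Return (ring, target_peg) pairs; state[i] is the peg of ring i (0 = smallest)."""
    tops = {}
    for ring, peg in enumerate(state):
        # Rings are visited smallest first, so the first ring seen on a peg is its top.
        tops.setdefault(peg, ring)
    moves = []
    for peg, ring in tops.items():
        for target in range(3):
            # A ring may move to an empty peg or onto a larger ring.
            if target != peg and tops.get(target, len(state)) > ring:
                moves.append((ring, target))
    return moves


def random_problem(n_rings, walk_length, rng):
    """Generate a training problem by a random walk away from the goal state."""
    state = [2] * n_rings  # goal: every ring on peg 2
    for _ in range(walk_length):
        ring, target = rng.choice(legal_moves(state))
        state[ring] = target
    return tuple(state)


def displaced_fraction(n_rings, walk_length, trials=200, seed=0):
    """Fraction of generated problems in which the largest ring is off its target peg."""
    rng = random.Random(seed)
    hits = sum(
        random_problem(n_rings, walk_length, rng)[-1] != 2 for _ in range(trials)
    )
    return hits / trials


def scaled_walk_length(n_rings, factor=4):
    """Hypothetical parameter-dependent walk length (an assumption, not from the paper)."""
    return factor * 2 ** n_rings
```

Comparing `displaced_fraction(n, fixed_length)` with `displaced_fraction(n, scaled_walk_length(n))` for growing `n` gives a rough picture of how a fixed walk length stops exercising the largest ring, whereas a length tied to the domain parameter has at least a chance of doing so.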
A related experiment involved transfer of knowledge between two similar domains that, unlike the domains above, are not parameterized. We generated a random grid, different from the one used for the experiments above, and performed a testing session with MICRO-HILLARY, using macros that were learned in the first grid. Using the macros improved MICRO-HILLARY's performance, reducing the number of operator applications from 1529 without macros to 594 with macros. This ability to transfer skill from one grid to another arises from the similar shape of the obstacles. A macro such as SSSSSWNNNNN can be helpful in getting around walls of various sizes, both in the original grid used for learning and in the grid used for testing.
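The following sketch shows why a single macro such as SSSSSWNNNNN can serve for walls of several sizes. The grid representation is an assumption made for illustration: walls are modelled as blocked edges between adjacent cells, positions are `(row, column)` pairs with rows growing southward, and the names `apply_macro` and `vertical_wall` are hypothetical. A macro is applied as a sequence of primitive moves and fails if any single move is blocked.

```python
STEP = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}

MACRO = "SSSSSWNNNNN"


def apply_macro(pos, macro, walls):
    """Apply each primitive move in turn; return the final cell, or None if a wall blocks a step."""
    for op in macro:
        d_row, d_col = STEP[op]
        nxt = (pos[0] + d_row, pos[1] + d_col)
        if frozenset((pos, nxt)) in walls:
            return None  # the macro is not applicable in this state
        pos = nxt
    return pos


def vertical_wall(col, top, height):
    """Blocked edges between columns col-1 and col, for rows top .. top+height-1."""
    return {frozenset(((r, col - 1), (r, col))) for r in range(top, top + height)}


if __name__ == "__main__":
    start = (0, 1)  # the agent wants to reach (0, 0), directly to the west
    for height in range(1, 7):
        print(height, apply_macro(start, MACRO, vertical_wall(1, 0, height)))
```

Under these assumptions the macro reaches the cell west of the start for every wall that extends at most five cells south of the agent, and fails only when the wall also blocks the crossing in the sixth row. This matches the observation that one learned macro can get around obstacles of similar shape but different size.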