
Experimenting with PARAMETRIC MICRO-HILLARY in the NxN Puzzle Domains

We have experimented with the PARAMETRIC MICRO-HILLARY algorithm in the NxN puzzle domains using the above specifications. The resulting macro sets were tested on random 10 x 10 puzzles.


Table 15: The upper part of the table lists the learning resources consumed by learning macros in the NxN puzzle domain. The middle part of the table contains the statistics of the generated macro sets. The lower part shows the performance of MICRO-HILLARY while using the macros to solve 100 random 10 x 10 puzzle problems. All the reported data is averaged over 100 learning sessions.
                     Operator applications   CPU seconds    Problems
                     Mean      Std.          Mean   Std.    Mean   Std.
Learning Resources   1084571   126338        270.7  62.7    232.8  13

                     Total number            Mean Length    Max Length
                     Mean      Std.          Mean   Std.    Mean   Std.
Generated Macros     14.87     0.83          8.82   0.23    18     0

                     Operator applications   CPU seconds    Solution Length
                     Mean      Std.          Mean   Std.    Mean   Std.
Performance          15891     1020          15.64  0.85    3028   268

Table 15 summarizes the 100 learning sessions. MICRO-HILLARY learned most of its macros while solving 3 x 3 and 4 x 4 puzzles. It typically learned one or two macros while solving 5 x 5 puzzles, and reached quiescence while solving 6 x 6 problems without learning any new macros. Note that, on average, at least 85% of the training problems and at least half of the learning time (in operator applications) were spent just on testing quiescence. Although PARAMETRIC MICRO-HILLARY solved 232 training problems on average, compared with an average of 60 problems solved by MICRO-HILLARY in the 15-puzzle domain, its total learning CPU time is about one-third of the time spent learning the 15-puzzle. This indicates that solving training problems in increasing order of difficulty indeed saves learning resources: PARAMETRIC MICRO-HILLARY learned some of its macros while solving 3 x 3 puzzles, saving time by escaping local minima in a lower-complexity domain.
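The increasing-difficulty training schedule described above can be sketched as follows. This is a minimal sketch, not the paper's actual interface: the `solve_with_learning` callback and the count-based quiescence test are our illustrative assumptions.

```python
# Hypothetical sketch of a PARAMETRIC MICRO-HILLARY-style training schedule:
# solve training problems of increasing size, and move to the next size
# (or stop) once `quiescence` consecutive problems are solved without
# acquiring any new macro. The solver callback is an assumption, not the
# paper's API: it solves one random n x n problem with the current macros
# and returns any newly learned macros.

def train(solve_with_learning, quiescence=20, max_size=6):
    macros = []
    for n in range(3, max_size + 1):
        solved_without_learning = 0
        while solved_without_learning < quiescence:
            new_macros = solve_with_learning(n, macros)
            if new_macros:
                macros.extend(new_macros)
                solved_without_learning = 0   # learning resets the count
            else:
                solved_without_learning += 1  # one more quiescent problem
    return macros
```

Note how the quiescence test dominates the problem count: even when nothing new is learned at a given size, `quiescence` problems must still be solved before moving on, which matches the observation that most training problems were spent just testing quiescence.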

A second test was performed with one of the macro sets (we arbitrarily selected the one that was learned first). We generated sets of 10 random testing problems for various sizes up to 50 x 50 and solved these problems using the macro set. The random problems were generated, as in the previous section, by even random permutations.
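The even-random-permutation generator mentioned above can be sketched as follows (helper names are illustrative; we assume the blank is held in a fixed corner cell, so an even permutation of the remaining tiles yields a solvable state):

```python
import random

# Sketch of a test-problem generator: a random even permutation of the
# tiles (with the blank fixed in the last cell) gives a solvable N x N
# puzzle state. Function names are ours, not the paper's.

def permutation_parity(perm):
    """Return 0 for an even permutation, 1 for odd, by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return inversions % 2

def random_even_puzzle(n, rng=random):
    """Random solvable n x n state: tiles 1..n*n-1 evenly permuted, 0 = blank."""
    tiles = list(range(1, n * n))
    rng.shuffle(tiles)
    if permutation_parity(tiles) == 1:       # odd shuffle: one swap fixes parity
        tiles[0], tiles[1] = tiles[1], tiles[0]
    return tiles + [0]                       # blank in the last cell
```

Forcing even parity with a single extra swap keeps the generator uniform over solvable states while ruling out the unreachable half of the permutation space.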

Figure 9: Testing the macros learned by PARAMETRIC MICRO-HILLARY on large puzzles. The graphs show mean number of operator applications as a function of the puzzle size. The error bars are one standard deviation away on either side of the mean. The first graph shows the results for sizes up to 20. The second one shows the results up to size 50.
[Two graphs: operator applications for puzzles up to 20 x 20, and up to 50 x 50.]

Figure 9 shows the mean number of operator applications as a function of the puzzle size. MICRO-HILLARY solved the whole test set in a reasonable time, and it appears that the program can now efficiently solve any solvable NxN problem. Indeed, we have proved that the learned macro set is complete and that MICRO-HILLARY can solve any solvable problem in O(N^3) operator applications with a reasonable constant (288). The proof is given in the appendix and is supported by our empirical testing: MICRO-HILLARY never resorted to the escape procedure throughout the entire testing phase. The proof is for a specific macro set, but it can be repeated for the other macro sets as well. A variation of the proof shows that MICRO-HILLARY can solve any solvable problem in O(N^3) even without learning; however, the constant in this upper bound is prohibitively high (10^12). We do not claim that MICRO-HILLARY will always learn successfully. It is possible for MICRO-HILLARY to reach quiescence and quit learning while having missed essential macros. The probability of such an event, however, diminishes as the quiescence parameter increases.

Note that both the empirical results and the formal proof show that the maximal radius of any puzzle problem, regardless of the puzzle size, with respect to the RR heuristic, is bounded by 18. This bounded-radius property is a necessary condition for MICRO-HILLARY to scale up its acquired macros for domains with a larger parameter. It can be easily shown that the MD heuristic does not have such a property.

Figure 10: The top graph shows the solution length (in basic moves) as a function of the puzzle size. The bottom graph shows the ratio between the solution length and the heuristic lower-bound on the shortest solution as a function of the puzzle size. In both graphs the error bars are one standard deviation away on either side of the mean.
[Two graphs: solution length, and the length-to-lower-bound ratio, as a function of puzzle size.]

Figure 10(a) shows the mean solution length as a function of the puzzle size. Figure 10(b) shows an upper bound on the ratio of the solution length to the optimal solution length as a function of the puzzle size (the solution length divided by the sum-of-Manhattan-distances lower bound). It is interesting to note that the graph flattens out at a value of about 5. Ratner and Warmuth [30,31] sketch a (quite complicated) hand-crafted approximation algorithm for the NxN puzzle; their earlier paper gives a constant approximation factor of 22. Looking at Figure 10, it is quite possible that MICRO-HILLARY could have found such an algorithm by itself. The constant factor of MICRO-HILLARY seems to be close to 5, but it is measured on random problems, while Ratner and Warmuth's constant is a worst-case upper bound. In the appendix, we prove that the length of the solutions found by MICRO-HILLARY for NxN puzzle problems is O(N^3) with a constant of 50 (substantially worse than Ratner and Warmuth's).
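The ratio plotted in Figure 10(b) can be computed as sketched below. We assume the standard goal state (tiles 1..N*N-1 in row-major order, blank in the last cell) and exclude the blank from the sum; these conventions are ours for illustration.

```python
# Sum-of-Manhattan-distances lower bound on the optimal solution length,
# and the solution-length ratio of Figure 10(b). The state is a flat list
# in row-major order; 0 denotes the blank, which is excluded from the sum.
# Goal: tiles 1..n*n-1 in row-major order, blank last (our assumption).

def manhattan_lower_bound(state, n):
    total = 0
    for pos, tile in enumerate(state):
        if tile == 0:
            continue                          # the blank does not count
        goal = tile - 1                       # goal cell index of this tile
        total += abs(pos // n - goal // n) + abs(pos % n - goal % n)
    return total

def length_ratio(solution_length, state, n):
    """Upper bound on solution_length / optimal, since the bound is a lower bound."""
    return solution_length / manhattan_lower_bound(state, n)
```

Because the sum of Manhattan distances never exceeds the optimal solution length, dividing by it can only overestimate the true ratio, which is why Figure 10(b) plots an upper bound.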


Shaul Markovitch
1998-07-21