next up previous
Next: Appendix B. The Efficiency Up: A Selective Macro-learning Algorithm Previous: Discussion

Appendix A. Applying PARAMETRIC MICRO-HILLARY to the NxN Puzzle

To show a more concrete example of defining the parameters for PARAMETRIC MICRO-HILLARY we list here the exact definitions used to apply the algorithm to the NxN puzzle domains. The domain parameter used is N with an initial value of 3.

Figure 11: An example for the notations used for the NxN puzzle.

To make the specifications more concise, we use the following definitions and notations. A puzzle state is a permutation of the sequence $\left\langle 0, 1, \ldots , N^2-1\right\rangle$. Each of the elements of the puzzle state is called a tile. 0 is called the empty tile. For convenience, we will represent a puzzle state by an NxN row-major matrix. Let s be a state, let $i,j \leq N$. We define tiles(i,j) to be the tile located in row i column j of s, where s is represented by a row-major N x N matrix. For every state s and tile $t \in s$, we define the tile location locs(t)=(i,j) where tiles(i,j)=t. Let l1=(i1,j1) and l2=(i2,j2) be two locations. The distance between l1 and l2 is defined as d(l1,l2)=|i1 - i2|+|j1 - j2|. Let $s=\left\langle t_1, \ldots ,t_{N^2} \right\rangle$ be a state. Let $s_g=\left\langle g_1, \ldots ,g_{N^2-1}, 0 \right\rangle$ be the goal state. The number of placed tiles is the largest p such that ti=gi for all $i \leq p$. The expression of p+1 in the matrix notation is called the next location and is marked as NextLoc(s). gp is called the next tile and is marked as NextTile(s). Figure 11 shows examples for the above notations. The heuristic function is the one described in the beginning of Section 4, generalized to the NxN puzzle. It is a linear combination of three factors. The weights were chosen to be high enough to enforce a lexicographic order. The least significant factor is the Manhattan distance between the empty tile and the next tile. Since it is bounded above by 2(N-1), the second factor is multiplied by 2N to make sure that it is dominant. For the same reason, we multiply the most significant factor by 4N2.

Table 19: The definitions of the parameters that were used to apply MICRO-HILLARY to the NxN puzzle domains.
Domain parameter N
Initial value 3
Generate-goal Let $t_1, \ldots ,t_{N^2-1}$ be a random permutation of $1,\ldots,N^2-1$. Generate-goal returns $\left\langle t_1, \ldots ,t_{N^2-1}, 0
Basic-operators $\left\{ u, d, l, r\right\}$. Let $loc_s(0)=\left(i_0,j_0\right)$. d(s) is defined as:

... & i=i_0+1,j=j_0\\
tile_s(i,j) & otherwise

d(s) is undefined for i0 = N - 1. u,l,r are defined similarly.
Heuristic-function $h(s,s_g)=4 \cdot N^2 \left ( N^2 - placed(s,s_g) \right ) +
2 \cdot N \cdot d
( NextLoc(s) ,$ locs(NextTile(s)) )+ d ( locs ( 0), locs(NextTile(s)) )

Table 19 lists the exact parameters used for MICRO-HILLARY in the NxN puzzle domain.

next up previous
Next: Appendix B. The Efficiency Up: A Selective Macro-learning Algorithm Previous: Discussion
Shaul Markovitch