Next: Learning to Solve Problems Up: Experimental Methodology Previous: Dependent Variables

### Independent Variables

There are many independent variables that affect the performance of the learning system. For the experiments described here we look at the following variables:
• Selection method:
• Minimum-to-better: The main selection method defined by Equation 3.
• Minimum-to-minimum: Our implementation of MACLEARN's selection method.
• Any-to-better: Our implementation of MORRIS's selection method.
• Heuristic function:
• RR: The heuristic function described above.
• Row-by-row-2 : The RR heuristic without the third component.
• Reduction : A variation of the RR heuristic that leads to placing of tiles in the first row, then the last column, then the second row, then the second to last column etc. This heuristic was suggested by Korf [16].
• Spiral: A variation of the RR heuristic that leads to placing the tiles in a spiral order from outside in.
• Sum-of-Manhattan-distances : The sum of the Manhattan distances between each tile and its destination (will be denoted by MD).
• Domain: Most of the experiments described here were performed in the 15-puzzle domain. However, we have tried MICRO-HILLARY in several other domains. In all the domains, the initial states are generated using the domain-independent training-problem generator which applies a random sequence of operators to the goal state.
• 24-puzzle: 5 x 5 sliding tile puzzle.
• 10-cannibals: The cannibals and missionaries problem [28]. To make it non-trivial, we use 10 cannibals and 10 missionaries instead of the usual 3 and 3. 10 cannibals and 10 missionaries are situated on the two banks of a river. They must all be moved onto one target bank. They can use a boat which can ship one or two persons. The cannibals should never outnumber the missionaries on either bank. The heuristic function used is the number of persons that are not yet located on the target bank.
• 10-stones: A puzzle that appeared in Nilsson's book [28]. Instead of using 3 black stones and 3 white stones we use 5 black stones and 5 white stones. The goal configuration is . A stone may move into an adjacent empty cell or may hop over one or two other stones into an empty cell. The heuristic function used is the number of stones not yet in their goal position.
• 5-Hanoi: The Towers of Hanoi problem with 5 rings in increasing sizes. There are 3 pegs and 5 rings. The goal position has the 5 rings placed on the left peg. A ring may be moved to the top of another peg provided that it is not placed on a smaller ring as a result. The heuristic function used is the number of rings not yet placed on their target peg.
• Grid: A grid of 50 x 50 with random parallel walls inserted to make the Manhattan-distance heuristic inaccurate. There are four basic operators (North, South, West and East). A goal state can be any location on the grid. The objective is to find a path from the initial location to the goal location. Figure 7 shows the grid used.
The emphasis of this paper is on building a general learning algorithm that is able to solve the NxN puzzle problem. The goal of testing the algorithm on these other domains was to show that the learning algorithm is indeed general and does not include domain-specific procedures.

Next: Learning to Solve Problems Up: Experimental Methodology Previous: Dependent Variables
Shaul Markovitch
1998-07-21