next up previous
Next: Selective Attention: Acquiring Macros Up: MICRO-HILLARY Previous: MICRO-HILLARY

Selective Experience: Generating Solvable Problems with Increasing Levels of Difficulty

The basic architecture of a macro-learning system requires a mechanism for generating training problems and a method for filtering them (or, alternatively, a method for ordering them). The main goal of the filter is to save learning resources by concentrating on helpful problems. To focus the attention of the learner, our acquisition method considers only macros that are part of the solution path. Therefore, to save learning resources, a good problem generator should generate only solvable problems. In addition, it is desirable to order the training problems in increasing levels of difficulty, so that the learner acquires as much knowledge as possible from easy problems before facing harder ones.

One good way to implement such a problem generation methodology is to require a domain-specific generator for each domain. For example, solvable problems in the NxN puzzle domain can be generated by performing an even permutation on the goal state. Since one of our main goals was to make the algorithm as domain-independent as possible, we supply a default domain-independent generator that works in many domains, including the sliding-tile puzzle. Our default problem generator works as follows.
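The even-permutation approach for the NxN puzzle can be sketched as follows. This is an illustrative implementation, not the one used by MICRO-HILLARY: with the blank kept in its goal cell, a sliding-tile configuration is solvable exactly when the tile permutation is even, so we shuffle the tiles and, if the resulting permutation is odd, restore even parity with a single swap.

```python
import random

def generate_solvable_puzzle(n):
    """Generate a solvable n x n sliding-tile state by applying an
    even permutation to the goal state (blank, denoted 0, stays in
    its goal cell, so solvability reduces to permutation parity)."""
    tiles = list(range(1, n * n))       # tiles 1 .. n^2 - 1
    random.shuffle(tiles)
    # The parity of a permutation equals the parity of its inversion count.
    inversions = sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )
    if inversions % 2 == 1:             # odd permutation: one swap fixes parity
        tiles[0], tiles[1] = tiles[1], tiles[0]
    return tiles + [0]                  # blank in its goal (last) cell

state = generate_solvable_puzzle(4)     # a random solvable 15-puzzle instance
```

Note that this generator offers no control over problem difficulty: the instances it produces are spread over the whole state space, which is exactly the limitation the default generator below addresses.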

Problems are generated by applying a random sequence of operators to a randomly generated goal state. This process is known as a random walk. The length of the random sequence is increased after each problem generation. In some domains, when the random sequences are short enough compared to the size of the search space, and when there are no ``bottlenecks'' in the search graph, longer sequences are likely to yield ``harder'' problems (problems with a larger distance between the start and goal states).

In other domains the hardness of the problems may be uniformly distributed and may not correlate with the length of the random walk. In such domains we can use the heuristic function as an estimate of problem difficulty by which the random problems are ordered. The generator sets a threshold on the heuristic distance between the initial state and the goal state and performs a random walk until a state with the required heuristic distance is found. The threshold is incremented with each iteration. This approach is still problematic in domains with ``bottlenecks,'' where parts of the search graph are difficult to access. For such domains it is best to use a domain-dependent generator.
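The threshold-based variant of the default generator can be sketched for the sliding-tile puzzle, using the Manhattan-distance heuristic as the difficulty estimate. This is a minimal illustration under those assumptions (the heuristic, the step limit, and the function names are ours, not the paper's): the walk starts at the goal state, so every generated problem is solvable by construction, and raising the threshold across iterations yields problems of increasing estimated difficulty.

```python
import random

def neighbors(state, n):
    """Legal successors of a sliding-tile state (tuple of ints, 0 = blank)."""
    b = state.index(0)
    r, c = divmod(b, n)
    succs = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < n and 0 <= nc < n:
            s = list(state)
            s[b], s[nr * n + nc] = s[nr * n + nc], s[b]   # slide tile into blank
            succs.append(tuple(s))
    return succs

def manhattan(state, n):
    """Sum of Manhattan distances of all tiles from their goal cells."""
    return sum(
        abs(i // n - (t - 1) // n) + abs(i % n - (t - 1) % n)
        for i, t in enumerate(state) if t != 0
    )

def generate_problem(n, threshold, max_steps=10000):
    """Random-walk from the goal state until the current state's heuristic
    distance reaches the threshold; the result is solvable by construction."""
    goal = tuple(range(1, n * n)) + (0,)
    state = goal
    for _ in range(max_steps):
        if manhattan(state, n) >= threshold:
            return state
        state = random.choice(neighbors(state, n))
    return state

# Increasing the threshold over iterations orders the training problems
# by increasing estimated difficulty.
training = [generate_problem(3, h) for h in range(2, 12, 2)]
```

In a domain with ``bottlenecks'' the walk may exhaust `max_steps` before reaching the threshold, which is precisely the case where a domain-dependent generator is preferable.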

In Section 5 we describe a method for increasing the difficulty of training problems by increasing the parameter that characterizes the domain (such as N in the case of NxN puzzles). For most parameterized domains, such as the NxN puzzle, increasing the parameter indeed increases the difficulty of the problems. The framework of processing input examples in order of difficulty, called learning from exercises, was studied formally by Natarajan [27] and empirically by Reddy and Tadepalli [32]. Both works assume a teacher who supplies a sequence of problems in increasing order of difficulty. Our default generator attempts to produce such a sequence automatically, freeing the learner from the need for a teacher.


Shaul Markovitch
1998-07-21