Next: Selective Experience: Generating Solvable Up: A Selective Macro-learning Algorithm Previous: Background: Selective Macro-Learning


MICRO-HILLARY

In this section we discuss MICRO-HILLARY--a particular instantiation of the architecture described above that is very simple, yet powerful enough to perform efficient learning in a large class of problem domains. MICRO-HILLARY, like most macro-learners, works with satisficing [38] search programs, i.e., programs whose goal is to find solutions as fast as possible, regardless of the length of the solution found [29, p. 14]. Such programs typically use heuristic functions that serve as preference predicates: the search strategy prefers to expand states with a lower heuristic value. In this work we consider heuristic functions that are not necessarily underestimating, but are always positive except in the goal states, where they have a zero value. We call such heuristic functions well-behaved. Many of these heuristic functions order states reasonably well except in a small number of local minima. The basic idea of MICRO-HILLARY is to acquire macro-operators for escaping from local minima, i.e., macros that lead from a local minimum to a state with a better heuristic value.
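To make the notion of a local minimum concrete, the following is a minimal sketch on an invented toy domain (not from the paper): a well-behaved heuristic that is positive everywhere except the goal, and a simple hill-climber that gets trapped where no operator improves the heuristic value.

```python
# Toy sketch: simple hill-climbing under a "well-behaved" heuristic --
# positive everywhere except at the goal, where it is zero.
# The domain, states, and values below are invented for illustration.

def simple_hill_climb(start, operators, h, max_steps=100):
    """Apply the first strictly improving operator; stop at the goal
    (h == 0) or at a local minimum, where no operator improves h."""
    s = start
    for _ in range(max_steps):
        if h(s) == 0:
            return s, "goal"
        for op in operators:
            t = op(s)
            if t is not None and h(t) < h(s):
                s = t
                break
        else:
            return s, "local-minimum"   # trapped: this is where macros help
    return s, "cutoff"

# 1-D toy domain over states 0..6; goal is 0, artificial local minimum at 3.
H = {0: 0, 1: 1, 2: 2, 3: 1, 4: 2, 5: 3, 6: 4}
ops = [lambda s: s - 1 if s > 0 else None,
       lambda s: s + 1 if s < 6 else None]
print(simple_hill_climb(6, ops, H.get))   # descends 6,5,4 and stops at 3
```

Starting from state 6, the climber improves monotonically until state 3, where both neighbors have a worse value even though the goal is nearby; this is exactly the situation MICRO-HILLARY's macros are meant to resolve.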

MICRO-HILLARY works by generating training problems and solving them. A good way to detect local minima is to use a search method, such as hill-climbing, that is itself susceptible to them. Therefore, MICRO-HILLARY uses simple hill-climbing for its problem solver, both in the learning mode and in the problem-solving mode. When trapped in a local minimum, MICRO-HILLARY searches for a way out to a state with a better heuristic value, and records this escape route as a macro-operator. Let Q be a predefined limit. We say that MICRO-HILLARY is in quiescence when it solves Q consecutive training problems without acquiring any new macros. MICRO-HILLARY quits learning when it detects quiescence.
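The escape step can be sketched as follows; the breadth-first strategy and the helper name `escape_bfs` are illustrative assumptions (the paper specifies the escape method later), and the toy domain is the same invented one as above.

```python
# Hedged sketch: escaping a local minimum by searching outward for any state
# with a strictly better heuristic value, and returning the escape route
# (a sequence of operator indices) to be recorded as a macro-operator.
from collections import deque

def escape_bfs(s, operators, h):
    """Breadth-first search from the local minimum s; return the shortest
    operator sequence reaching a state t with h(t) < h(s), or None."""
    seen = {s}
    frontier = deque([(s, [])])
    while frontier:
        u, path = frontier.popleft()
        for i, op in enumerate(operators):
            v = op(u)
            if v is None or v in seen:
                continue
            if h(v) < h(s):
                return path + [i]   # escape route found: record as a macro
            seen.add(v)
            frontier.append((v, path + [i]))
    return None

# 1-D toy domain over states 0..6; goal is 0, local minimum at state 3.
H = {0: 0, 1: 1, 2: 2, 3: 1, 4: 2, 5: 3, 6: 4}
ops = [lambda s: s - 1 if s > 0 else None,
       lambda s: s + 1 if s < 6 else None]
print(escape_bfs(3, ops, H.get))   # macro: apply operator 0 three times
```

From the local minimum at state 3, the search must pass through states with equal or worse heuristic values (2, then 1) before reaching state 0, whose value is better; that three-step route is exactly what gets stored as a macro.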

When used for solving externally supplied problems, MICRO-HILLARY uses the union of the basic operators and the learned macros. When faced with a local minimum, it searches for an escape route but does not acquire new macros. An alternative version could perform on-line learning instead. Figure 2 shows the MICRO-HILLARY algorithm. To complete the algorithm, we need to specify a method for generating training problems and a method for escaping from local minima. The SolveProblem procedure serves both for learning and for solving externally supplied problems.
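The problem-solving mode described above can be sketched on the same invented toy domain: a recorded escape route is composed into a single macro-operator, and hill-climbing then runs over the union of the basic operators and the macros. The function names here are illustrative, not the paper's.

```python
# Hedged sketch: solving with the union of basic operators and learned
# macros. The toy domain and the learned macro [0, 0, 0] are invented
# for illustration.

def make_macro(indices, operators):
    """Compose a recorded escape route (operator indices) into one
    macro-operator that applies the whole sequence."""
    def macro(s):
        for i in indices:
            s = operators[i](s)
            if s is None:
                return None
        return s
    return macro

def solve(start, operators, h, max_steps=100):
    """Simple hill-climbing over basic operators and macros alike."""
    s = start
    for _ in range(max_steps):
        if h(s) == 0:
            return s
        for op in operators:
            t = op(s)
            if t is not None and h(t) < h(s):
                s = t
                break
        else:
            return None   # would trigger an escape search; no learning here
    return None

# 1-D toy domain over states 0..6; goal is 0, local minimum at state 3.
H = {0: 0, 1: 1, 2: 2, 3: 1, 4: 2, 5: 3, 6: 4}
basic = [lambda s: s - 1 if s > 0 else None,
         lambda s: s + 1 if s < 6 else None]
macro = make_macro([0, 0, 0], basic)       # escape route learned earlier
print(solve(6, basic + [macro], H.get))    # now reaches the goal state 0
```

With the macro available, the same hill-climber that previously stalled at state 3 steps straight through the local minimum to the goal, because the macro's end state has a better heuristic value than the minimum itself.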

Figure 2: The MICRO-HILLARY algorithm
[Pseudocode box, garbled in this conversion; the recoverable fragment gives the macro-acquisition condition: a sequence of operators $o_1, \ldots, o_k$ such that $h(o_k(\ldots o_1(s) \ldots), s_g) < h(s, s_g)$.]




Shaul Markovitch
1998-07-21