The goal of a learning system is to modify
a performance element so as to
improve its performance
with respect to some given criterion. The learning
system
does so by using its experience to generate knowledge for use by the
performance element. Most of the machine learning community is
concerned with improving the *classification accuracy* of *classifiers* based on *classified examples*. A smaller part of
the
community is concerned with improving the *speed* of *search
programs* based on *problem-solving experience*. This type of
learning is
commonly termed *speedup learning* [40].

One of the most common methods of speedup learning is the acquisition
of macro-operators
[4,10,13,19,21,23,34].
Given the traditional definition of
a search space with a set of states and a set of basic operators that
connect them, a macro-operator is defined as a finite sequence of
basic
operators. Macro-operators are typically acquired during problem solving and
are used in the same manner as basic operators. The process of
acquiring macros is relatively simple. A system needs only to solve
training problems and pass the search tree to the acquisition
procedure, which, in turn, can add
any sub-sequence of operators from the tree
to its
macro knowledge-base. The acquired macros, however, carry costs as
well as benefits. One of the most significant costs
associated with using macros is the
increased branching factor of the search space.
When the costs outweigh the benefits, we face a
phenomenon called the *utility problem*
[24,7,22,26].
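To make the notion concrete, a macro-operator can be sketched as a stored sequence of basic operators that is applied as if it were a single operator. The following minimal sketch uses our own illustrative names and a toy integer domain, not anything from the systems cited above:

```python
# Minimal sketch of macro-operators over a search space (illustrative names).
# A state is any hashable object; a basic operator maps a state to a successor
# state, or to None if it is inapplicable in that state.

def compose(ops):
    """Build a macro-operator: apply a fixed sequence of basic operators."""
    def macro(state):
        for op in ops:
            state = op(state)
            if state is None:          # macro inapplicable at this state
                return None
        return state
    return macro

# Toy domain: states are integers; basic operators increment and decrement.
inc = lambda s: s + 1
dec = lambda s: s - 1

macro = compose([inc, inc, dec])       # behaves like one operator
print(macro(5))                        # -> 6
```

Note that each acquired macro added to the knowledge base enlarges the set of operators tried at every state, which is exactly the branching-factor cost mentioned above.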

Due to the very large number of macros available for acquisition, a learning program must be selective in order to obtain a macro set with high utility. The goal of this research is to demonstrate that a simple macro-learning technique, combined with the right selection mechanisms, can lead to a speedup learning algorithm that is powerful and general, yet simple as well.

We start by defining a framework for selective macro-learning
and describe the general architecture of a macro learner.
The *information filtering* framework [22]
is used to describe
the various logical components of a macro learner. In particular,
the framework emphasizes the important role of the selection
mechanisms used during the learning process.
We continue by describing the MICRO-HILLARY algorithm.
To make the presentation and the experiments clearer,
we present the algorithm in two stages. We first
describe a simplified version of the algorithm that
learns macros for a single domain at a time. We then show a generalized
version that can handle a family of domains.

MICRO-HILLARY, like all other macro-learners, is useful for
*satisficing* search.
Macro-learning has a negative utility when used for *optimizing search*
[20]. Even when the learned macros
are optimal, they are not useful for optimizing search procedures:
knowing the shortest way
from point *A* to point *B* and the shortest way from
*B* to *C* does not tell us anything about
the optimality of their concatenation.

The input for the MICRO-HILLARY algorithm is a domain, specified
by a set of basic operators, a heuristic function (not necessarily
admissible) for evaluating the merit of
states, and a function for generating random goal states.
The MICRO-HILLARY algorithm performs *learning by experimentation*
[25].
It generates solvable
training problems with an increasing level of difficulty.
The training problems are solved by a search program that
performs hill-climbing combined with a procedure for escaping
local minima. The escape routes are then recorded as macros.
The problem solver uses the same search procedure, using
macros as if they were basic operators.
MICRO-HILLARY performs the maximal possible generalization
by trying all the macros in all the states. Such a method increases
the branching factor by the number of macros and would be too
expensive in search procedures such as best-first search. In
hill-climbing, however, this overhead is much smaller (unless
the solutions found are significantly longer).

The architecture of MICRO-HILLARY contains building blocks that are inspired
by the earlier works of many other researchers.
Generating training problems with increasing
difficulty was done *manually* by several researchers
[9,23,25].
EASe [34] used hill-climbing to avoid an exponential
increase in search time as a result of adding macros. MORRIS [23] and MACLEARN [9] used
macro selection methods that inspired the one used by MICRO-HILLARY. The
MICRO-HILLARY architecture is a well-balanced and well-tuned combination of
these building blocks that emerged after extensive experimental study.

The main domain upon which MICRO-HILLARY was tested is
the general NxN sliding-tile puzzle [31], which includes
as special cases the 8-puzzle, the 15-puzzle and
the 24-puzzle. The 8-puzzle and
the 15-puzzle problems have long been popular among
mathematicians [12,18,42]
and AI researchers.
The sliding-tile puzzles are among the most commonly used
domains for testing heuristic search algorithms and
macro-learning
[1,9,13,19,34,35].
The reason for the popularity of
sliding-tile puzzles is the simplicity of their specification
combined with the very large size of their associated
search space [29, p. 4].
An NxN puzzle problem has an associated search graph
of size *N*^{2}!/2.
The *optimizing* NxN puzzle problem
is NP-hard [31], although some research efforts
have been devoted to finding optimal solutions to the
15-puzzle [15]
and the 24-puzzle [17].
It is not difficult to devise
a domain-specific algorithm for solving the *satisficing*
NxN puzzle problem [31].
However, attempts to find even
non-optimal solutions for the puzzle *using weak methods*
have been successful only for small puzzles.
MACLEARN [10], for example, one of the most successful
macro-learners, was demonstrated solving only a single
5 x 5 puzzle problem.
A notable exception is the paper by Korf [16], who
outlined a search algorithm for solving NxN puzzles.
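As an illustration of the kind of domain specification described earlier (basic operators, a heuristic function, and a generator of solvable problems), the NxN puzzle could be encoded roughly as follows. The tuple representation and the Manhattan-distance heuristic are our own assumptions for this sketch:

```python
# Illustrative NxN sliding-tile puzzle domain (here N = 4, the 15-puzzle).
import random

N = 4

def goal_state(n=N):
    """Tiles 1..n*n-1 in order, blank (0) in the last cell."""
    return tuple(list(range(1, n * n)) + [0])

def basic_operators(state, n=N):
    """Slide a tile into the blank: yields (move_name, successor) pairs."""
    b = state.index(0)
    r, c = divmod(b, n)
    moves = {'up': (r - 1, c), 'down': (r + 1, c),
             'left': (r, c - 1), 'right': (r, c + 1)}
    for name, (nr, nc) in moves.items():
        if 0 <= nr < n and 0 <= nc < n:
            t = nr * n + nc
            s = list(state)
            s[b], s[t] = s[t], s[b]         # swap blank with adjacent tile
            yield name, tuple(s)

def manhattan(state, n=N):
    """Sum of tile distances from their goal positions."""
    total = 0
    for pos, tile in enumerate(state):
        if tile:
            goal = tile - 1
            total += abs(pos // n - goal // n) + abs(pos % n - goal % n)
    return total

def random_solvable(steps=50, n=N):
    """A random walk from the goal state guarantees solvability."""
    s = goal_state(n)
    for _ in range(steps):
        s = random.choice([t for _, t in basic_operators(s, n)])
    return s
```

Nothing in this specification is specific to a fixed N, which is what makes the puzzle a natural candidate for the parameterized family of domains discussed later.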

It is therefore a worthwhile challenge to devise a simple, domain-independent learning algorithm that will be able to acquire the ability to efficiently solve any solvable NxN puzzle problem. The paper includes a comprehensive set of experiments that test the properties of the algorithm, mostly in the NxN puzzle domain. We show that the learned macros can solve very large puzzles, and supply a formal proof that the generated macros are sufficient for efficiently solving puzzles of any size.

While the NxN puzzle is the main domain tried, the MICRO-HILLARY algorithm is completely domain-independent. We made efforts to build a minimal domain-independent architecture that is sufficient to solve the NxN puzzle, as well as problems in other domains, without introducing domain-dependent enhancements. We include experiments in other domains to demonstrate that the algorithm is indeed domain-independent.

Section 2 defines selective macro-learning and
identifies the choices to be made when designing a macro-learning
system.
Section 3 describes the general MICRO-HILLARY^{1} algorithm
for learning macro-operators. Section 4
describes experiments with MICRO-HILLARY in various domains, mainly
the 15-puzzle domain. Section 5 describes
an extension of MICRO-HILLARY that is able to learn a family of domains
that are differentiated by a single numeric parameter, and
describes the application of the parameterized algorithm
to the family of NxN puzzle domains. Finally, Section
6
discusses and sums up MICRO-HILLARY's strengths and weaknesses,
in comparison to other macro-learning algorithms.