STAGE is a search technique which learns a problem-specific heuristic
evaluation function as it searches. The heuristic is trained by
least-squares TD(lambda) to predict, from features of states along the
search trajectory, how well a fast Markovian search method such as
hillclimbing will perform starting from each state. Search proceeds
by alternating between two stages: performing the fast search to
gather new training data, and following the learned heuristic to reach
a promising new start state.
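
The two alternating stages can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the sawtooth objective, the single feature (the state itself), and the names `objective`, `hillclimb`, `fit_line`, and `stage` are all hypothetical. Plain least-squares regression on the observed hillclimbing outcomes stands in for least-squares TD(lambda), and a random restart is used when following the learned heuristic makes no move.

```python
import random

N = 100  # states are the integers 0..N (toy assumption)

def objective(x):
    # Toy cost to minimize: a downward trend toward x = N plus a sawtooth
    # that makes every multiple of 5 a local minimum.
    return (N - x) + 8 * (x % 5)

def neighbors(x):
    return [y for y in (x - 1, x + 1) if 0 <= y <= N]

def hillclimb(x, cost):
    """Best-improvement descent on `cost`; returns the visited states."""
    trajectory = [x]
    while True:
        best = min(neighbors(x), key=cost)
        if cost(best) >= cost(x):
            return trajectory  # no strictly better neighbor: local optimum
        x = best
        trajectory.append(x)

def fit_line(samples):
    """Ordinary least squares for y ~ a + b*x over (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    b = cov_xy / var_x if var_x > 0 else 0.0
    return mean_y - b * mean_x, b

def stage(start, iterations=10, rng=None):
    """Alternate between base search and search on the learned heuristic."""
    if rng is None:
        rng = random.Random(0)
    samples = []  # (state feature, outcome of hillclimbing from that state)
    best_state, best_value = start, objective(start)
    x = start
    for _ in range(iterations):
        # Stage 1: run the fast search and label every state on its
        # trajectory with the final cost the search reached.
        trajectory = hillclimb(x, objective)
        end = trajectory[-1]
        outcome = objective(end)
        samples.extend((s, outcome) for s in trajectory)
        if outcome < best_value:
            best_state, best_value = end, outcome

        # Refit the heuristic V(s) = a + b*s on all data gathered so far.
        a, b = fit_line(samples)

        # Stage 2: hillclimb on V from where stage 1 stopped, to reach
        # a promising new start state.
        x = hillclimb(end, lambda s: a + b * s)[-1]
        if x == end:
            # V suggests no move; fall back to a random restart.
            x = rng.randint(0, N)
    return best_state, best_value

if __name__ == "__main__":
    print(stage(12))
```

In this toy setting plain hillclimbing from state 12 stops at the local minimum 10. Once trajectories from two different basins have been seen, the fitted line slopes downward in the state, so following it carries the search toward the right-hand end of the range, where the global minimum lies.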
STAGE has produced good results on a variety of combinatorial
optimization domains, including VLSI channel routing, Bayes net
structure-finding, bin-packing, Boolean satisfiability, radiotherapy
treatment planning, and geographic cartogram design. It provides strong
evidence that reinforcement learning methods can be efficient and
effective on large-scale decision problems.
More Information
- This paper outlines STAGE and presents many experimental results:
  Boyan, J. A. and A. W. Moore (1998). "Learning Evaluation Functions
  for Global Optimization and Boolean Satisfiability." Fifteenth
  National Conference on Artificial Intelligence (AAAI). (postscript)