STAGE is a search technique which learns a problem-specific heuristic evaluation function as it searches. The heuristic is trained by least-squares TD(lambda) to predict, from features of states along the search trajectory, how well a fast Markovian search method such as hillclimbing will perform starting from each state. Search proceeds by alternating between two stages: performing the fast search to gather new training data, and following the learned heuristic to reach a promising new start state.

STAGE has produced good results on a variety of combinatorial optimization domains, including VLSI channel routing, Bayes net structure-finding, bin-packing, Boolean satisfiability, radiotherapy treatment planning, and geographic cartogram design. It provides strong evidence that reinforcement learning methods can be efficient and effective on large-scale decision problems.

More Information

Back to Glossary Index