Hierarchical Methods


Another strategy for dealing with large state spaces is to treat them as a hierarchy of learning problems. In many cases, hierarchical solutions introduce slight sub-optimality in performance, but potentially gain a good deal of efficiency in execution time, learning time, and space.

Hierarchical learners are commonly structured as gated behaviors, as shown in Figure 8. There is a collection of behaviors that map environment states into low-level actions and a gating function that decides, based on the state of the environment, which behavior's actions should be switched through and actually executed. Maes and Brooks [68] used a version of this architecture in which the individual behaviors were fixed a priori and the gating function was learned from reinforcement. Mahadevan and Connell [72] used the dual approach: they fixed the gating function and supplied reinforcement functions for the individual behaviors, which were learned. Lin [60] and Dorigo and Colombetti [38, 37] learned both levels of this architecture, first training the behaviors and then training the gating function. Many of the other hierarchical learning methods can be cast in this framework.

Figure 8: A structure of gated behaviors.
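
The gated structure can be made concrete with a small sketch. The Python code below is an illustration under stated assumptions, not a reconstruction of any of the cited systems: the corridor task, the names GatedController, go_left, go_right, step, and train, and the use of tabular Q-learning as the learning rule for the gate are all hypothetical choices made here. It follows the first variant described above, in which the behaviors are fixed in advance and only the gating function is learned from reinforcement.

```python
import random
from collections import defaultdict

N_STATES, GOAL = 7, 3  # hypothetical corridor: cells 0..6, goal in the middle


def go_left(state):
    """Fixed behavior: always emit the low-level action -1."""
    return -1


def go_right(state):
    """Fixed behavior: always emit the low-level action +1."""
    return +1


def step(state, action):
    """Toy environment: move along the corridor; reward 1 on reaching GOAL."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


class GatedController:
    """Gating function over fixed behaviors, learned here by tabular Q-learning."""

    def __init__(self, behaviors, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
        self.behaviors = behaviors   # functions mapping state -> low-level action
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, behavior index) -> value estimate
        self.rng = random.Random(seed)

    def gate(self, state, explore=True):
        """Decide which behavior is switched through in this state."""
        n = len(self.behaviors)
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(n)
        return max(range(n), key=lambda b: self.q[(state, b)])

    def act(self, state, explore=True):
        """Return the chosen behavior index and the action it emits."""
        b = self.gate(state, explore)
        return b, self.behaviors[b](state)

    def update(self, state, b, reward, next_state, done):
        """One Q-learning backup on the gate's choice of behavior."""
        n = len(self.behaviors)
        best_next = 0.0 if done else max(self.q[(next_state, k)] for k in range(n))
        target = reward + self.gamma * best_next
        self.q[(state, b)] += self.alpha * (target - self.q[(state, b)])


def train(controller, episodes=2000, max_steps=50):
    """Train only the gate; the behaviors themselves never change."""
    starts = [s for s in range(N_STATES) if s != GOAL]
    for _ in range(episodes):
        state = controller.rng.choice(starts)
        for _ in range(max_steps):
            b, action = controller.act(state)
            nxt, reward, done = step(state, action)
            controller.update(state, b, reward, nxt, done)
            state = nxt
            if done:
                break


controller = GatedController([go_left, go_right])
train(controller)
# Behavior index selected greedily by the learned gate in each cell.
print([controller.gate(s, explore=False) for s in range(N_STATES)])
```

In this sketch the gate sees the same state as the behaviors and its only decision is which behavior index to pass through. The dual arrangement, with a fixed gate and learned behaviors, would instead keep gate as a hand-written rule and attach a separate learner and reinforcement function to each behavior.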

Leslie Pack Kaelbling
Wed May 1 13:19:13 EDT 1996