12:00, 2 Oct 1996, WeH 7220
Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning
Jeff Schneider
Model learning combined with dynamic programming has been shown to be
effective for learning control of continuous state dynamic systems.
The simplest method assumes the learned model is correct and applies
dynamic programming to it, but many function approximators also provide
uncertainty estimates on their fit. How can these estimates be
exploited? This paper
addresses the case where the system must be prevented from having
catastrophic failures during learning.
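The certainty-equivalence baseline mentioned above simply runs dynamic programming on the learned model as if it were exact. As a minimal sketch (not the paper's implementation), assuming a hypothetical tabular model with learned transition probabilities P and rewards R:

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Certainty-equivalent DP: treat the learned model (P, R) as correct.

    P: (A, S, S) array of learned transition probabilities P[a, s, s']
    R: (S, A) array of learned expected rewards
    Returns the optimal value function and greedy policy for that model.
    """
    A, S, _ = P.shape
    V = np.zeros(S)
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum('ast,t->sa', P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new
```

The hazard the abstract points to is visible here: the planner trusts P and R everywhere, including regions the data barely cover, so a confidently wrong model can steer the real system into a catastrophic state.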
We propose a new algorithm adapted from the dual control literature
and use Bayesian locally weighted regression models with stochastic
dynamic programming. A common reinforcement learning assumption is
that aggressive exploration should be encouraged. This paper
addresses the converse case, in which the system must rein in
exploration. The algorithm is illustrated on a 4-dimensional
simulated control problem.
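The paper's models are Bayesian locally weighted regressions; as a rough illustration of the idea (a simplified stand-in, not the paper's estimator), a kernel-weighted predictor can report an error bar that grows where training data are sparse, and a cautious controller can then avoid actions whose predicted outcomes carry large error bars:

```python
import numpy as np

def lwr_predict(X, y, x_query, bandwidth=0.5, noise=0.1, prior_var=1.0):
    """Locally weighted prediction with a crude uncertainty proxy.

    Returns (mean, stderr): a Gaussian-kernel-weighted average of y near
    x_query, plus an error bar that shrinks as local data accumulate.
    bandwidth, noise, and prior_var are illustrative hyperparameters.
    """
    d2 = np.sum((X - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))   # kernel weights
    n_eff = w.sum()                            # effective local sample size
    mean = np.dot(w, y) / (n_eff + 1e-9)       # guard against empty regions
    # Bayesian-flavoured error bar: prior uncertainty decays with n_eff,
    # leaving only irreducible noise where data are dense.
    stderr = np.sqrt(prior_var / (1.0 + n_eff) + noise ** 2)
    return mean, stderr
```

Queried inside the training data, stderr is small; queried far outside it, stderr approaches the prior scale, which is exactly the signal a safe learner can use to rein in exploration.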