Monday, May 3, 1993; WeH 7220; 3:00 pm
***************************************************************************

       Reinforcement Learning in Continuous State and Action Spaces

                           Gregory J. Karakoulas
                        Knowledge Systems Laboratory
                    Institute for Information Technology
                      National Research Council, Canada

                                 ABSTRACT

The control tasks in most reinforcement learning applications are of low
order, with finite state and action spaces, making it feasible to explore
the entire space of state-action combinations.  Reinforcement learning (RL)
does not scale well to complex tasks with continuous, and hence infinite,
state and action spaces.  Teaching has been employed to accelerate RL.  In
addition, combining direct and indirect methods has been suggested for
addressing the scaling issues of RL.  Building upon these two ideas, we
propose the architecture of a hybrid RL agent that scales the typical RL
agent to cope with a stochastic control task having real-valued, infinite
state and action spaces.  The task is to identify optimal control rules for
planning under uncertainty about an economic system.  The agent assesses
its control performance through an internal critic that exploits
information about the local preferences of an external critic and a
simulation model of the system being controlled.  The Q~-learning algorithm
embedded in the agent extends Q-learning by incorporating a probabilistic
search method.  The performance of the algorithm is assessed through a
simulation experiment in the class of LQG control problems.
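For readers unfamiliar with the setting, the following is a minimal sketch
(in Python), not the Q~-learning algorithm presented in the talk: ordinary
one-step Q-learning with a quadratic function approximator on a
one-dimensional LQG-style task, where the continuous action is chosen by a
probabilistic search (sampling candidate actions) rather than by
enumerating a finite action set.  All names, dimensions, and constants are
illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # 1-D linear system x' = a*x + b*u + noise, quadratic cost x^2 + r*u^2
    # (a stand-in for the economic system and external-critic preferences
    # described in the abstract; values are assumptions for illustration).
    a, b, r, noise_std = 1.0, 1.0, 0.1, 0.1

    def features(x, u):
        # Quadratic basis for Q(x, u): [x^2, x*u, u^2, 1]
        return np.array([x * x, x * u, u * u, 1.0])

    w = np.zeros(4)          # weights of the linear-in-features Q estimate
    alpha, gamma = 0.005, 0.95

    def greedy_action(x, n_samples=50, width=2.0):
        # Probabilistic search over the continuous action space: sample
        # candidate actions and keep the one with the lowest predicted cost.
        candidates = rng.uniform(-width, width, size=n_samples)
        q_values = [features(x, u) @ w for u in candidates]
        return candidates[int(np.argmin(q_values))]

    x = rng.normal()
    for step in range(20000):
        u = greedy_action(x) + rng.normal(scale=0.3)   # exploration noise
        cost = x * x + r * u * u                       # immediate cost
        x_next = a * x + b * u + rng.normal(scale=noise_std)
        # One-step Q-learning update toward cost + gamma * min_u' Q(x', u'),
        # with the minimum over u' approximated by the sampled search above.
        target = cost + gamma * (features(x_next, greedy_action(x_next)) @ w)
        td_error = target - features(x, u) @ w
        w += alpha * td_error * features(x, u)
        x = x_next if abs(x_next) < 5.0 else rng.normal()  # reset if diverging

    print("learned Q weights [x^2, x*u, u^2, 1]:", w)

In the talk's architecture, the internal critic, the external critic's local
preferences, and the simulation model of the controlled system play the role
that the hand-coded cost function and dynamics play in this toy sketch.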