12:00, Wed 28 Feb 1996, WeH 7220

Scaling Issues in Reinforcement Learning
Leemon Baird

This talk will address two problems that arise when scaling reinforcement learning up to large problems: continuous time and function approximation. Q-learning does not work for control problems with continuous time, small time steps, or a discount factor near 1; the Advantage Learning algorithm works well in these cases and can learn much faster than Q-learning. Traditional reinforcement-learning algorithms are guaranteed to converge when lookup tables are used, but not when general function approximators such as standard neural networks are used; this problem is solved by a class of algorithms known as "residual algorithms". The talk will cover these problems and their solutions, and briefly describe empirical results obtained using a combination of these algorithms. Rough sketches of the two ideas follow.
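
To make the first idea concrete, here is a minimal tabular sketch of an advantage-learning backup in the spirit of the Harmon-and-Baird formulation. The problem sizes, time step dt, scaling constant k, and learning rate are illustrative assumptions, not values from the talk; the key point is that the temporal-difference term is rescaled by 1/(k*dt), so the gaps between actions do not collapse as the time step shrinks, which is what breaks Q-learning in near-continuous time.

    import numpy as np

    # Hypothetical sizes and constants, for illustration only.
    n_states, n_actions = 10, 4
    dt    = 0.1      # time-step size
    gamma = 0.99     # discount factor per unit of time
    k     = 0.2      # advantage scaling constant (assumed)
    alpha = 0.1      # learning rate

    A = np.zeros((n_states, n_actions))   # advantage table

    def advantage_update(s, a, r, s_next):
        """One tabular advantage-learning backup (sketch)."""
        v  = A[s].max()        # value of the current state
        v2 = A[s_next].max()   # value of the next state
        # The TD term is scaled by 1/(k*dt), so the advantage of the
        # best action over the others stays large even as dt -> 0.
        target = v + (r + (gamma ** dt) * v2 - v) / (k * dt)
        A[s, a] += alpha * (target - A[s, a])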
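And a sketch of the second idea: a residual algorithm for a linear value function, which mixes the direct (TD-style) update with the residual-gradient update. The feature size, mixing weight phi_mix, and other constants are again illustrative assumptions. Setting phi_mix = 0 gives the fast but potentially divergent direct update; phi_mix = 1 gives the pure residual-gradient update, which converges but can be slow; intermediate values trade speed for guaranteed stability.

    import numpy as np

    # Hypothetical sizes and constants, for illustration only.
    n_features = 8
    alpha, gamma = 0.05, 0.99
    phi_mix = 0.5                 # residual weighting, in [0, 1]
    w = np.zeros(n_features)      # weights of linear V(s) = w . f(s)

    def residual_update(f_s, r, f_s2):
        """One residual-algorithm step on a transition (f_s, r, f_s2)."""
        delta = r + gamma * (w @ f_s2) - (w @ f_s)   # Bellman residual
        # Weighted combination of the direct gradient (f_s) and the
        # residual gradient (f_s - gamma * f_s2).
        grad = (1 - phi_mix) * f_s + phi_mix * (f_s - gamma * f_s2)
        w[:] = w + alpha * delta * grad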