Wednesday 19 October, 12:00, WeH 1327

Adaptive linear quadratic control using policy iteration (*)

Erik Ydstie (Professor of Chemical Engineering, CMU)

We present stability and convergence results for Dynamic Programming-based reinforcement learning applied to Linear Quadratic Regulation (LQR). The specific algorithm we analyze is based on Q-learning, and it is proven to converge to the optimal controller provided that the underlying system is controllable and a particular signal vector is persistently exciting. The performance of the algorithm is illustrated by applying it to a model of a flexible beam, and an extension to the nonlinear case is discussed.

(*) Joint work with Steven Bradtke and Andy Barto.
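
To make the abstract's algorithm concrete, below is a minimal sketch of Q-learning-style policy iteration for discrete-time LQR, under stated assumptions: the two-state plant, costs, exploration noise level, and iteration counts are illustrative choices, not values from the talk. Each iteration fits the quadratic Q-function of the current gain by least squares on the Bellman equation, with exploration noise supplying the persistent excitation, and then improves the policy by minimizing the fitted quadratic over the control.

# A minimal sketch of Q-learning-style policy iteration for discrete-time
# LQR, in the spirit of the algorithm described above.  The plant, costs,
# noise level, and iteration counts are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Assumed plant x' = A x + B u with stage cost x'Qx + u'Ru.  A is chosen
# stable so that the zero gain is a valid stabilizing initial policy, as
# the algorithm requires.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)
R = np.eye(1)
n, m = B.shape
d = n + m

def svec(z):
    # Quadratic feature vector: theta . svec(z) == z' H z for symmetric H.
    outer = np.outer(z, z)
    i, j = np.triu_indices(len(z))
    return np.where(i == j, 1.0, 2.0) * outer[i, j]

def unsvec(theta):
    # Rebuild the symmetric matrix H from its parameter vector theta.
    H = np.zeros((d, d))
    i, j = np.triu_indices(d)
    H[i, j] = theta
    return H + H.T - np.diag(np.diag(H))

K = np.zeros((m, n))  # initial stabilizing policy u = -K x
for it in range(6):
    # Policy evaluation: fit Q(x, u) = [x; u]' H [x; u] by least squares on
    # the Bellman equation, driving the plant with the current policy plus
    # exploration noise (the persistently exciting signal).
    rows, costs = [], []
    x = rng.standard_normal(n)
    for t in range(400):
        u = -K @ x + 0.5 * rng.standard_normal(m)
        c = x @ Q @ x + u @ R @ u
        x_next = A @ x + B @ u
        u_next = -K @ x_next  # greedy action at the next state
        rows.append(svec(np.concatenate([x, u]))
                    - svec(np.concatenate([x_next, u_next])))
        costs.append(c)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(costs), rcond=None)
    H = unsvec(theta)
    # Policy improvement: the minimizer of the fitted quadratic Q over u.
    K = np.linalg.solve(H[n:, n:], H[n:, :n])

print("learned gain K:", K)

Note that the improved gain is read directly off the estimated matrix H, so no model of A or B enters the update; an online variant would replace the batch least-squares fit with a recursive one.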