Explore or Exploit? Build a Model or Not?: Results from a Chernoff-Bounds-Based Analysis of Reinforcement Learning Algorithms

Satinder Singh (*)
AT&T Labs-Research

Much of the analysis of reinforcement learning (RL) algorithms has relied on tools from stochastic approximation theory to prove asymptotic convergence results for a number of popular RL algorithms. These results were crucial in gaining credibility for a young field of research and have, at least in part, led to the increasing use of RL algorithms by those interested in solving real RL problems. Nevertheless, many basic questions about the computational aspects of popular RL algorithms have remained open. In the recent past, methods familiar to computational learning theorists, such as the use of sampling and Chernoff bounds, have begun to be applied to the RL problem. In this talk I will present some results obtained from this new kind of analysis of RL problems: How difficult is it to solve general RL problems? How can one solve the exploration-exploitation dilemma? Is it better to learn a model from the data and use a model-based RL method, or is it better to use a model-free RL method?

* This is joint work with Michael Kearns.