2:00, Friday April 8, WeH 7220
Monte Carlo Learning in Environments with Hidden State
Satinder Pal Singh^*
Department of Brain and Cognitive Sciences
MIT
Planning and search algorithms from AI and reinforcement
learning (RL) algorithms from machine learning form a sound
theoretical basis for building architectures that enable an agent to
determine autonomously the ``optimal'' actions in certain classes of
environments. Unfortunately, all of the theory and most of the
practice of these algorithms are limited to problems in which the
agent can sense the state of the environment completely. In many
real-world tasks, however, the state of the environment is ``hidden'',
i.e., only partially observable by the agent. Most previous general
approaches to hidden-state problems use computationally expensive
techniques to estimate the state of the environment, and use the
conventional algorithms on the estimated state.
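For concreteness, the core of such state-estimation approaches is a Bayesian belief update over the hidden state. The sketch below shows that update for an invented two-state example; the transition and observation probabilities are made up for illustration and are not from the talk:

```python
import numpy as np

# Illustrative belief update for a two-state hidden environment.
# T[a][s, s'] = P(s' | s, a); O[s', o] = P(o | s'). Numbers invented.
T = {0: np.array([[0.9, 0.1],
                  [0.2, 0.8]])}
O = np.array([[0.7, 0.3],
              [0.4, 0.6]])

def belief_update(b, a, o):
    """Bayes rule: b'(s') is proportional to O[s', o] * sum_s T[a][s, s'] * b(s)."""
    b_new = O[:, o] * (b @ T[a])
    return b_new / b_new.sum()

b = np.array([0.5, 0.5])        # uniform prior over hidden states
b = belief_update(b, a=0, o=1)  # posterior after taking action 0, seeing obs 1
```

Maintaining this belief vector exactly is what makes general state estimation expensive: its size grows with the number of hidden states, and planning over beliefs is harder still.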
This talk addresses how well an agent can do in hidden-state
problems without resorting to any form of state estimation. We
develop a new framework for solving such problems by
including stochastic decision policies in the search space, and by
defining the ``value'' or utility of a cluster of states. We also
discuss why the conventional RL framework needs to be extended and
present theoretical results about what popular RL algorithms do when
applied to hidden-state problems. Finally, we present a new,
computationally tractable, Monte Carlo algorithm for finding locally
optimal stochastic decision policies in a class of hidden-state
problems.
*Joint work with Tommi Jaakkola and Michael Jordan
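The flavor of the framework can be conveyed with a toy aliased-state problem. In the sketch below (the dynamics, rewards, and episodic Monte Carlo estimator are invented for illustration and are not the talk's algorithm), two hidden states emit the same observation; every deterministic memoryless policy gets trapped on a negative-reward self-loop, while a stochastic memoryless policy does strictly better:

```python
import random

def episode_return(p, horizon=50):
    """Average reward of one episode under the memoryless stochastic
    policy P(action 1 | observation) = p. In this toy problem, action 1
    moves s1 -> s2 with reward +1 but self-loops at s2 with reward -1;
    action 2 mirrors it (s2 -> s1, +1; self-loop at s1, -1)."""
    s = random.choice([1, 2])  # hidden state; both states look identical
    total = 0.0
    for _ in range(horizon):
        a = 1 if random.random() < p else 2
        if a == 1:
            total += 1.0 if s == 1 else -1.0
            s = 2
        else:
            total += 1.0 if s == 2 else -1.0
            s = 1
    return total / horizon

def mc_value(p, n_episodes=2000):
    """Monte Carlo estimate of the policy's expected average reward."""
    return sum(episode_return(p) for _ in range(n_episodes)) / n_episodes

v_det = mc_value(1.0)  # deterministic: always action 1; loops on -1
v_mix = mc_value(0.5)  # uniform stochastic: expected reward 0 per step
```

Here any deterministic policy earns roughly -1 per step in the long run, while the 50/50 stochastic policy earns about 0, showing why the search space must include stochastic policies and why Monte Carlo estimates of a policy's value suffice to compare them.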