THE OPTIMAL REWARD PROBLEM, OR WHERE DO REWARDS COME FROM?

SATINDER SINGH
Joint work with Jonathan Sorg and Richard Lewis
Computer Science and Engineering, University of Michigan

Impressive results have been obtained by research approaches to autonomous agents that start with a given reward function and focus on developing theory and algorithms for learning or planning policies that lead to high cumulative reward. In a departure from this work, we recognize that in many situations the starting point is instead an agent designer, with its own reward function, who seeks to build an autonomous agent to act on its behalf. What reward function should the designer build into the autonomous agent? In this new view, the conventional practice of setting the agent's parameters (its reward function) equal to the given preferences (the designer's reward function) implements a preferences-parameters confound. If an agent is bounded, as most agents are in practice, we expect that breaking the preferences-parameters confound will be beneficial. We define the optimal reward problem: choosing the agent's reward function from among a set of candidate reward functions, given a designer's reward function, an agent architecture, and a distribution over environments. The main focus of the talk will be a discussion of empirical and theoretical insights obtained by solving the optimal reward problem.

BIO

Satinder Singh is a Professor of Computer Science and Engineering at the University of Michigan, where he also currently serves as Director of the AI Lab. His research spans reinforcement learning, decision-theoretic planning, and computational game theory.
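
A minimal formal sketch of the optimal reward problem as defined in the abstract (the notation below is illustrative and not taken from the talk): let $R_D$ denote the designer's reward function, $\mathcal{R}$ the set of candidate agent reward functions, $P(\mathcal{E})$ the distribution over environments, and $A$ the agent architecture, which maps an internal reward function $r$ to the behavior the (possibly bounded) agent produces. The optimal reward is then

$$ r^{*} \;=\; \arg\max_{r \in \mathcal{R}} \; \mathbb{E}_{E \sim P(\mathcal{E})} \left[ U_{R_D}\big(A(r),\, E\big) \right], $$

where $U_{R_D}(A(r), E)$ stands for the expected cumulative designer reward earned when the agent, driven internally by $r$, acts in environment $E$. In this notation, the preferences-parameters confound is the special case of fixing $r = R_D$ rather than optimizing over $\mathcal{R}$; the abstract's claim is that for bounded agents some $r \neq R_D$ can yield strictly higher designer reward.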