Computer Science Thesis Proposal

  • Gates Hillman Centers
  • Traffic21 Classroom 6501

Improving Sample Efficiency in Theory and Practice for Reinforcement Learning through Better Exploration

This thesis proposes using more sophisticated exploration techniques to construct new sample-efficient algorithms and advance the theory of more practical reinforcement learning settings, as well as adapting theoretically efficient exploration techniques to practical algorithms and the deep reinforcement learning setting. One proposed technique, directed exploration, explicitly performs exploration toward specific goals, accumulating information that narrows down the possibility space of unknown parameters. Directed exploration can improve sample complexity in a variety of more practical settings: when solving multiple tasks either concurrently or sequentially, algorithms can explore distinguishing state-action pairs to cluster similar tasks together and share samples to speed up learning; in large, factored MDPs, repeatedly trying to visit lesser-known state-action pairs can reveal whether the current dynamics model is faulty and which features are unnecessary. Other techniques, such as data-dependent confidence intervals as a form of tempered optimism combined with explicit exploration toward gathering information about the value gap between actions, may yield better empirical performance and make progress toward tighter, problem-dependent bounds. Finally, these exploration techniques can be adapted to the deep reinforcement learning setting by reducing it to the small, discrete setting, using deep learning as a state abstraction and discretizing the learned state representation.
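As a rough illustration of the flavor of count-based directed exploration described above, the following minimal sketch pairs a tabular Q-learning update with a visit-count bonus whose width shrinks as data accumulates, in the spirit of data-dependent confidence intervals as tempered optimism. Every name and constant here (CountBonusQAgent, bonus_coef, and so on) is an illustrative assumption, not an algorithm taken from the proposal document.

```python
import numpy as np

class CountBonusQAgent:
    """Hypothetical tabular agent: optimism steered toward lesser-known pairs."""

    def __init__(self, n_states, n_actions, gamma=0.95, lr=0.1, bonus_coef=1.0):
        self.q = np.zeros((n_states, n_actions))
        # Start counts at 1 so the bonus stays finite for unvisited pairs.
        self.counts = np.ones((n_states, n_actions))
        self.gamma, self.lr, self.bonus_coef = gamma, lr, bonus_coef

    def act(self, state):
        # The exploration bonus shrinks like 1/sqrt(count): a data-dependent
        # confidence width that directs the agent toward less-visited actions.
        bonus = self.bonus_coef / np.sqrt(self.counts[state])
        return int(np.argmax(self.q[state] + bonus))

    def update(self, state, action, reward, next_state):
        self.counts[state, action] += 1
        target = reward + self.gamma * self.q[next_state].max()
        self.q[state, action] += self.lr * (target - self.q[state, action])
```

For the deep reinforcement learning reduction the abstract mentions, one simple way to discretize a learned state representation (again a hedged sketch under assumed names, not the proposal's method) is to bucket each latent coordinate onto a coarse grid:

```python
def discretize(latent, cell_size=0.5):
    # Map a continuous learned representation to a hashable grid cell so that
    # tabular exploration machinery (visit counts, bonuses) can be reused.
    return tuple(np.round(np.asarray(latent) / cell_size).astype(int))
```

A count table indexed by these cells would then play the role of the per-state counts in the tabular sketch above.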

Thesis Committee:
Emma Brunskill (Chair)
Drew Bagnell
Ruslan Salakhutdinov
Remi Munos (Google DeepMind)

Copy of Proposal Document
