Date 
Topic 
Teacher 
Links 
8/31 
No class this week, CSD immigration course
 

9/2 
No class this week, CSD immigration course
 

9/7 
Labor Day, no class 


9/9 
Overview
 Brunskill 
slides 
9/14 
Monte Carlo estimation, TD(0), and Fitted Value Iteration 
Brunskill 
notes 
9/16 
Fitted Value Iteration 
Brunskill 
FQI paper (Ernst et al. 2005), lecture notes, M. Ghavamzadeh's lecture notes on AVI, API 
9/21 
Least-Squares Policy Iteration 
Brunskill 
lecture notes, LSPI (Lagoudakis and Parr, 2003) 
9/23 
Approximate Model-based learning 
Brunskill 
lecture notes, lecture slides (different than notes), "An Analysis of Linear Models, Linear Value-Function Approximation, and Feature Selection for Reinforcement Learning"

9/28 
Constructing a good set of features 
Brunskill 
lecture notes, lecture slides (different than notes), "Greedy Algorithms for Sparse Reinforcement Learning"

9/30 
Constructing a good set of features 2 
Brunskill 
lecture notes, lecture slides from 9/28, "Batch iFDD: A Scalable Matching Pursuit Algorithm for Solving MDPs"

10/5 
Evaluating the output of Batch RL Methods 
Brunskill 
lecture notes, lecture slides (read these then notes for temporal ordering), "An analysis of model-based interval estimation for Markov decision processes" paper

10/7 
Evaluating the output of Batch RL Methods 
Brunskill/Thomas 
Slides from start of class, lecture notes on Bias and Variance approach, Phil Thomas's slides, "Bias and Variance Approximation in Value Function Estimates" paper

10/12 
Importance Sampling Approaches to evaluating Batch RL 
Thomas 
lecture slides, paper: "High Confidence Off-Policy Evaluation", paper: "High Confidence Policy Improvement"

10/19 
Selecting among Models in Batch RL for future performance
 Brunskill 
lecture notes, lecture slides, "Offline Policy Evaluation Across Representations with Applications to Educational Games", "Model Selection in Markovian Processes"

10/21 
Online learning
 Brunskill 
lecture notes, "Incremental Model-based Learners with Formal Learning-Time Guarantees"

10/26 
Regret bounds
 Brunskill 
lecture notes, lecture slides

10/28 
Project meetings
 

11/2 
Bayes-optimal RL
 Brunskill 
POMDP lecture notes (mostly background reference), lecture slides, "Monte-Carlo Planning in Large POMDPs", "Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search", "Bayes-Optimal Reinforcement Learning for Discrete Uncertainty Domains"

11/4 
Sample Efficient Model-based RL
 Brunskill 
lecture slides, "Gaussian processes for sample efficient reinforcement learning with RMAX-like exploration", "TEXPLORE: Real-Time Sample-Efficient Reinforcement Learning for Robots"

11/9 
Policy Search: Policy Gradient
 Brunskill 
lecture slides, scribed notes from Pieter Abbeel's class that include the derivation I replicated on the board (see pages 1-2), "Policy Gradient Methods for RL with Function Approximation"

11/11 
Policy Search: Sample Efficiency with Bayesian Optimization
 Brunskill 
lecture slides, Ryan Adams's intro to Bayesian Optimization, "Bayesian Optimization for Learning Gaits under Uncertainty"

11/16 
RL for DARPA Robotics Challenge & Pouring Tasks
 Akihiko Yamaguchi 
lecture slides

11/18 
Risk Sensitive RL: Optimizing CVaR
 Brunskill 
lecture slides, "Optimizing the CVaR Via Sampling", A. Tamar's PhD thesis (see section 1.2.1 for different risk-sensitive objectives that can be of interest)

11/23 
Safe Exploration
 Brunskill 
(rough) lecture notes to support paper presentation, "Safe Exploration in MDPs"

11/25 
Thanksgiving break



11/30 
Why doesn't the stuff you learn in class work in real life? A robotics-focused perspective.
 Chris Atkeson 
Dynamic Optimization class website

12/2 
Inverse Reinforcement Learning 
Brunskill 
lecture notes, "Maximum Entropy Inverse Reinforcement Learning", Abbeel's slides on IRL

12/7 
Students
 Project Presentations 

12/9 
Students
 Project Presentations 
