Date Topic Teacher Links
8/31 No class this week, CSD immigration course
9/2 No class this week, CSD immigration course
9/7 Labor Day, no class
9/9 Overview Brunskill slides
9/14 Monte Carlo estimation, TD(0), and Fitted Value Iteration Brunskill notes
9/16 Fitted Value Iteration Brunskill FQI paper (Ernst et al. 2005), lecture notes, M.Ghavamzadeh's lecture notes on AVI, API
9/21 Least-Squares Policy Iteration Brunskill lecture notes, LSPI (Lagoudakis and Parr, 2003)
9/23 Approximate Model-based learning Brunskill lecture notes, lecture slides (different than notes), "An Analysis of Linear Models, Linear Value-Function Approximation, and Feature Selection for Reinforcement Learning"
9/28 Constructing a good set of features Brunskill lecture notes, lecture slides (different than notes), "Greedy Algorithms for Sparse Reinforcement Learning"
9/30 Constructing a good set of features 2 Brunskill lecture notes, lecture slides from 9/28, Batch iFDD: A Scalable Matching Pursuit Algorithm for Solving MDPs
10/5 Evaluating the output of Batch RL Methods Brunskill lecture notes, lecture slides (read these then notes for temporal ordering), "An analysis of model-based interval estimation for Markov decision processes" paper
10/7 Evaluating the output of Batch RL Methods Brunskill/Thomas Slides from start of class, lecture notes on Bias and Variance approach, Phil Thomas's slides, "Bias and Variance Approximation in Value Function Estimates" paper
10/12 Importance Sampling Approaches to evaluating Batch RL Thomas lecture slides, paper: "High Confidence Off-Policy Evaluation", paper: "High Confidence Policy Improvement"
10/19 Selecting among Models in Batch RL for future performance Brunskill lecture notes, lecture slides, "Offline Policy Evaluation Across Representations with Applications to Educational Games", "Model Selection in Markovian Processes"
10/21 Online learning Brunskill lecture notes, "Incremental Model-based Learners with Formal Learning-Time Guarantees"
10/26 Regret bounds Brunskill lecture notes, lecture slides
10/28 Project meetings
11/2 Bayes-optimal RL Brunskill POMDP lecture notes (mostly background reference), lecture slides, "Monte-Carlo Planning in Large POMDPs", "Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search", "Bayes-Optimal Reinforcement Learning for Discrete Uncertainty Domains"
11/4 Sample Efficient Model-based RL Brunskill lecture slides, "Gaussian processes for sample efficient reinforcement learning with RMAX-like exploration", "TEXPLORE: Real-Time Sample-Efficient Reinforcement Learning for Robots"
11/9 Policy Search: Policy Gradient Brunskill lecture slides, scribed notes from Pieter Abbeel's class that include the derivation I replicated on the board (see pages 1-2), "Policy Gradient Methods for RL with Function Approximation"
11/11 Policy Search: Sample Efficiency with Bayesian Optimization Brunskill lecture slides, Ryan Adams's intro to Bayesian Optimization, "Bayesian Optimization for Learning Gaits under Uncertainty"
11/16 RL for DARPA Robotics Challenge & Pouring Tasks Akihiko Yamaguchi lecture slides
11/18 Risk Sensitive RL: Optimizing CVaR Brunskill lecture slides, "Optimizing the CVaR Via Sampling", A.Tamar's PhD thesis (see section 1.2.1 for different risk-sensitive objectives that can be of interest)
11/23 Safe Exploration Brunskill (rough) lecture notes to support paper presentation, "Safe Exploration in MDPs"
11/25 Thanksgiving break
11/30 Why doesn't the stuff you learn in class work in real life? A robotics-focused perspective. Chris Atkeson Dynamic Optimization class website
12/2 Inverse Reinforcement Learning Brunskill lecture notes, "Maximum Entropy Inverse Reinforcement Learning", Abbeel's slides on IRL
12/7 Student Project Presentations
12/9 Student Project Presentations