| Date | Topic | Notes | Useful links |
| --- | --- | --- | --- |
| Aug 26 Tuesday | Intro to Decision Making | Lec1 | |
| Aug 28 Thursday | Experimental Design | Lec2 | On Computationally Tractable Selection of Experiments; Near optimal Design of experiments via Regret Minimization; Combinatorial Algorithms for Optimal Design |
| Sept 2 Tuesday | Experimental Design; Multi Armed Bandits | Lec3 | Coreset selection NNs; Intro to Multi-Armed Bandits, Ch1 |
| Sept 4 Thursday | Adaptive exploration, Martingale bounds | Lec4 | Intro to Multi-Armed Bandits, Ch1; RL_theory book AJKS, Lemma6.2 |
| Sept 9 Tuesday | UCB; lower bound | Lec5 | Intro to Multi-Armed Bandits, Ch1,2 |
| Sept 11 Thursday | Nonparametric Bandits | Lec6 | Intro to Multi-Armed Bandits, Ch4 |
| Sept 16 Tuesday | Linear Bandits | Lec7 | Reinforcement Learning: Theory & Algorithms, Ch6, Dani et al, Abbasi et al |
| Sept 18 Thursday | Linear Bandits concentration | MAB Motivation; Lec8 | Intro to Multi-Armed Bandits, Ch3 |
| Sept 23 Tuesday | Bayesian regret, Thompson Sampling | Lec9 | Intro to Multi-Armed Bandits, Ch3; Optimize via posterior sampling, Russo_VanRoy; GP-UCB paper |
| Sept 25 Thursday | Online learning with experts | Lec10 | Intro to Multi-Armed Bandits, Ch5 |
| Sept 30 Tuesday | Adversarial Bandits | Lec11 | Intro to Multi-Armed Bandits, Ch6 |
| Oct 2 Thursday | Contextual Bandits | Lec12 | Intro to Multi-Armed Bandits, Ch8 |
| Oct 7 Tuesday | Generalized Bandits | Lec13 | Square-CB paper |
| Oct 9 Thursday | Markov Decision Processes | Lec14 | RL_theory book AJKS, Ch 1.1,1.2 |
| Oct 14 Tuesday | No Class - FALL BREAK | | |
| Oct 16 Thursday | No Class - FALL BREAK | | |
| Oct 21 Tuesday | Value and Policy Iteration | Lec15 | RL_theory book AJKS, Ch 1.3 |
| Oct 23 Thursday | Tabular MDP | Lec16 | RL_theory book AJKS, Ch 7 |
| Oct 28 Tuesday | Tabular MDP | | RL_theory book AJKS, Ch 7 |
| Oct 30 Thursday | Linear MDP | | LSVI-UCB paper |
| Nov 4 Tuesday | No Class - ELECTION DAY | | |
| Nov 6 Thursday | General Function Approximation | | RL_theory book AJKS, Ch 9; General_func_Akshaynotes |
| Nov 11 Tuesday | General Function Approximation; Policy Gradient | | RL_theory book AJKS, Ch 9,11 |
| Nov 13 Thursday | Policy Gradient | | RL_theory book AJKS, Ch 11 |
| Nov 18 Tuesday | Offline RL | | RL_theory book AJKS, Ch 4 |
| Nov 20 Thursday | Project Presentations | | |
| Nov 25 Tuesday | Project Presentations | | |
| Nov 27 Thursday | No Class - THANKSGIVING | | |
| Dec 2 Tuesday | Hybrid RL | | Hybrid_YudaNotes |
| Dec 4 Thursday | Reinforcement Learning from Human Feedback | | RLHF_paper |