Markov Decision Process (discrete)
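[Figure: state-transition diagram with five states, s1 through s5. Edge labels (transition probabilities): 0.7, 0.3, 0.9, 0.1, 0.3, 0.3, 0.4, 0.99, 0.1, 0.2, 0.8. Reward labels: r=-10, r=20, r=0, r=1, r=0.]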
[Bellman 57] [Howard 60] [Sutton/Barto 98]