TD Learning Demo with Stochastic Actions


Temporal Difference Learning Gridworld Demo

Speed: Fast, Normal, Slow

Input Maze File URL:

Download Maze File as JSON: Download

Exploration epsilon: 0.15

Gamma discount factor: 0.15

Alpha learning rate: 0.15
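
The three parameters above are the usual knobs of a tabular temporal difference learner. The page does not say which TD variant the demo runs, so the following is only a minimal sketch that assumes a Q-learning style update with epsilon-greedy exploration. The names `choose_action`, `td_update`, the four-action set, and the dictionary-based value table are hypothetical and not taken from the demo itself.

```python
import random

# Values shown in the demo's controls.
EPSILON = 0.15  # probability of picking a random (exploratory) action
GAMMA = 0.15    # discount factor applied to future value
ALPHA = 0.15    # learning rate (step size of each update)

# Assumed action set for a gridworld; the demo's real action set is not shown.
ACTIONS = ["up", "down", "left", "right"]


def choose_action(q, state, rng=random):
    """Epsilon-greedy: explore with probability EPSILON, otherwise act greedily."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def td_update(q, state, action, reward, next_state, terminal=False):
    """One Q-learning style TD update (an assumed variant, not confirmed by the demo)."""
    if terminal:
        best_next = 0.0
    else:
        best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    target = reward + GAMMA * best_next          # bootstrapped TD target
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + ALPHA * (target - old)  # move toward the target
    return q[(state, action)]


if __name__ == "__main__":
    q = {}
    td_update(q, (0, 0), "right", 1.0, (0, 1))
    print(q)
```

With a discount factor as small as 0.15, reward information fades quickly as it propagates backward, so states far from the goal would receive very small values under this sketch.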

Edit the Maze:

Stochastic Probabilities
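
Stochastic actions mean that the move the agent actually makes can differ from the one it chose. The page does not show how its probabilities are structured, so the sketch below assumes a common "slip" model in which the intended move succeeds most of the time and otherwise veers to one side. The 0.8 / 0.1 / 0.1 split and the names `SLIP_PROBS` and `sample_executed_action` are illustrative assumptions, not values or identifiers from the demo.

```python
import random

# Hypothetical slip probabilities; the demo's actual values are not shown.
SLIP_PROBS = {"intended": 0.8, "left_of": 0.1, "right_of": 0.1}

# Direction obtained by turning 90 degrees left of each heading.
LEFT_OF = {"up": "left", "left": "down", "down": "right", "right": "up"}
# Turning right is the inverse mapping of turning left.
RIGHT_OF = {v: k for k, v in LEFT_OF.items()}


def sample_executed_action(intended, rng=random):
    """Return the action actually executed, given the intended one."""
    r = rng.random()
    if r < SLIP_PROBS["intended"]:
        return intended
    if r < SLIP_PROBS["intended"] + SLIP_PROBS["left_of"]:
        return LEFT_OF[intended]
    return RIGHT_OF[intended]


if __name__ == "__main__":
    counts = {}
    for _ in range(10_000):
        a = sample_executed_action("up")
        counts[a] = counts.get(a, 0) + 1
    print(counts)  # roughly 80% "up", 10% "left", 10% "right"
```

Under a model like this, the learner still updates the value of the action it intended, so the learned values absorb the risk of slipping into walls or hazards.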