
Motivation

Imagine yourself on a soccer field with the ball at your feet, 25 yards from the opponents' goal. There is a single defender (the goalie) between you and the goal, and a teammate of yours is off to one side, unmarked, and ready to receive a pass. Imagine also, if you must, that you are a skilled soccer player, and that you enjoy playing the game immensely. You would like nothing more than to help your team score a goal.

In this situation, you have two options: shoot the ball directly at the goal or pass the ball to your teammate so that she can take a shot. The first time you find yourself in this situation, you do not know what to do. You decide at random whether to shoot or to pass, and you take note of the result. As you continue to play, however, you find yourself in this situation repeatedly: you have the ball in the same spot, and your teammate is in the same position, ready to receive a pass. The only difference is that the defender's starting position is never quite the same. Furthermore, since you and your teammate are remarkably consistent in your abilities, the ball travels along the same path every time you shoot; even when you pass the ball, your teammate redirects it in the same way every time. That is, the ball's motion is completely deterministic once you have either shot or passed it.

In contrast, the defender is continuously moving in front of the goal, and his motion may or may not be deterministic. At first you shoot or pass randomly, but after a few successful attempts, you begin to learn, based on the defender's position, whether to shoot or to pass. Even if the defender's motion is in fact deterministic, you can never be entirely sure of scoring unless you have scored before when the defender started in exactly the same position.






Peter Stone
Mon Dec 11 15:42:40 EST 1995