Discussion and Conclusion




 

The experiments described in this paper demonstrate the power of using flexible inputs to learning algorithms. For two different learned policies, we achieved good performance in situations much broader than those used for training. Most important was the generalization of the shooter's shooting policy. Trained in one action quadrant with the goal in a fixed place and the ball moving along a fixed trajectory, it was used successfully in all four action quadrants, with the ball moving along different trajectories and with the goal in different locations. The shooter's aiming policy also generalized to the three action quadrants that were not used during training, as well as to different goal positions.

The secret behind the flexibility of these learned policies was that their inputs (and outputs) were coordinate-independent. All of the measurements represented the relative positions of the ball, the shooter, and the goal, with no mention of absolute field coordinates. Had we used absolute coordinates as inputs, we would have had to train new NNs for every new situation. We might have been able to find a single NN that could learn a disjunctive concept with more variables, but we would at least have had to provide new training data to cover the new situations.
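To make the idea of coordinate-independent inputs concrete, here is a minimal Python sketch. It is an illustration only: the function names and the particular features (two distances and one signed angle) are assumptions chosen for the example, not the inputs actually used in the experiments. The point it shows is that features computed from relative positions stay the same when the whole scene is rotated and shifted on the field, so a network trained on them sees the same input in both situations.

```python
import math


def relative_features(shooter, ball, goal):
    """Hypothetical coordinate-independent features for one scene.

    Each argument is an (x, y) position in field coordinates.
    Returns (shooter-ball distance, shooter-goal distance,
    signed angle from the shooter->ball direction to the
    shooter->goal direction, in radians).
    """
    bx, by = ball[0] - shooter[0], ball[1] - shooter[1]
    gx, gy = goal[0] - shooter[0], goal[1] - shooter[1]
    ball_dist = math.hypot(bx, by)
    goal_dist = math.hypot(gx, gy)
    # atan2(cross, dot) gives the signed angle between the two vectors.
    angle = math.atan2(bx * gy - by * gx, bx * gx + by * gy)
    return ball_dist, goal_dist, angle


def move_scene(point, theta, dx, dy):
    """Rotate a point by theta about the origin, then translate it."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * point[0] - s * point[1] + dx,
            s * point[0] + c * point[1] + dy)


# An assumed training scene, and the same scene placed elsewhere on the field.
shooter, ball, goal = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)
moved = [move_scene(p, math.pi / 2, 3.0, -2.0) for p in (shooter, ball, goal)]

original_inputs = relative_features(shooter, ball, goal)
moved_inputs = relative_features(*moved)
```

In this sketch `original_inputs` and `moved_inputs` agree up to floating-point error, whereas the raw positions in the two scenes are entirely different, which is why inputs based on absolute coordinates would require new training data for each new placement.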

By using these flexible inputs, we can incrementally teach general behavior templates to our robots, building each one on top of the last. Like human soccer players, they can first learn to make contact with a moving ball, then learn to aim it, and only then start thinking about trying to beat an opponent and about team-level strategies.

Our future research agenda includes both of these higher-level challenges. We will collect several different behaviors learned in a manner similar to the shooting behavior described in this paper. Then we will use them to plan how to work as part of a team against hostile opponents. We will put our robots through an intense training regimen to prepare them for RoboCup '97.






Peter Stone
Wed Nov 8 14:49:26 EST 1995