 
  
  
   
Encouraged by the success and flexibility of our initial solution, we next varied a parameter of the initial setup to test whether the solution would extend further. Throughout Section 4, the passer started 35 units away from the ball and accelerated at full speed until striking it. This process consistently propelled the ball at about 135 units/sec. To make the task more challenging, we varied the ball's speed by starting the passer at a random distance of 32-38 units from the ball. The ball then traveled towards the shooter at a speed of 110-180 units/sec.
Before making any changes to the shooter's shooting policy, we tested the policy trained in Section 4 on the new task. We found that the 3-input NN was not sufficient to handle the varying speed: it gave a success rate of only 49.1% because the shooter mistimed its acceleration. To correct this mistiming, we added a fourth input to the NN to represent the speed of the ball: Ball Speed.
The shooter computed the Ball Speed from the ball's change in position over a given amount of time. Due to sensor noise, the change in position over a single time slice did not give an accurate reading. On the other hand, since the ball slowed down over time, the shooter could not use the ball's total change in position over the whole time it had been observed either. As a compromise, the shooter computed the Ball Speed from the ball's change in position over the last 10 time slices (or fewer if 10 positions had not yet been observed).
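The windowed estimate described above can be sketched as follows. This is only an illustrative sketch, not the original implementation: the class name `BallSpeedEstimator`, the time-slice duration `dt`, and the use of straight-line displacement between the oldest and newest positions in the window are all assumptions made for the example.

```python
from collections import deque
import math

WINDOW = 10  # number of time slices to look back, as stated in the text


class BallSpeedEstimator:
    """Hypothetical helper: speed from displacement over the last few time slices."""

    def __init__(self, window=WINDOW, dt=1.0):
        # Storing window + 1 positions means a full buffer spans `window` slices.
        self.positions = deque(maxlen=window + 1)
        self.dt = dt  # assumed duration of one time slice, in seconds

    def observe(self, x, y):
        """Record the ball's (noisy) position for the current time slice."""
        self.positions.append((x, y))

    def speed(self):
        """Return units/sec over the available window, or None with < 2 positions."""
        if len(self.positions) < 2:
            return None
        (x0, y0), (x1, y1) = self.positions[0], self.positions[-1]
        elapsed = (len(self.positions) - 1) * self.dt
        return math.hypot(x1 - x0, y1 - y0) / elapsed
```

Because the buffer is bounded, old positions drop out automatically, so the estimate tracks the ball's recent speed as it slows down while still averaging noise over several slices.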
To accommodate the additional quantity used to describe a world state, we gathered new training data. As before, the shooter used the random shooting policy during the training trials. This time, however, each training instance consisted of four inputs describing the state of the world plus an output indicating whether or not the trial was successful. Of the 5737 samples gathered for training, 963 (16.8%) were positive examples.
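As a small illustration of the data layout, a training instance can be represented as four inputs plus a success flag. The names `TrainingInstance` and `positive_fraction` are hypothetical and chosen only for this sketch; the input values below are placeholders, not real data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TrainingInstance:
    # Four inputs describing the world state; the fourth is the Ball Speed.
    inputs: Tuple[float, float, float, float]
    # True if the trial ended in a successful shot.
    success: bool


def positive_fraction(instances: List[TrainingInstance]) -> float:
    """Fraction of instances that are positive examples."""
    return sum(inst.success for inst in instances) / len(instances)


# Placeholder data set with the class counts reported in the text.
data = [TrainingInstance((0.0, 0.0, 0.0, 0.0), i < 963) for i in range(5737)]
print(f"{100 * positive_fraction(data):.1f}% positive")
```

With 963 positives out of 5737 samples, the classes are clearly imbalanced, which is worth keeping in mind when reading the misclassification counts.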
For the purposes of training the new NN, we scaled the Ball Speed to fall between 0.0 and 1.0. Except for the fourth input, the new NN looked like the one pictured in Figure 4(b). It had two hidden units, a bias unit at each level, and a learning rate of 0.001. Training this NN for 4000 epochs resulted in a mean squared error of 0.0512, with 651 of the instances misclassified.
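The following sketch shows one way the scaled input and the network shape could fit together. It rests on several assumptions that are not stated in the text: the scaling is taken to be linear min-max scaling with bounds of 110 and 180 units/sec (borrowed from the stated speed range), the units are taken to be sigmoid units, and the function names and weight layout are hypothetical.

```python
import math

# Assumed scaling bounds, taken from the 110-180 units/sec range in the text.
SPEED_MIN, SPEED_MAX = 110.0, 180.0


def scale_speed(speed, lo=SPEED_MIN, hi=SPEED_MAX):
    """Linearly map speed into [0.0, 1.0], clamping values outside the range."""
    scaled = (speed - lo) / (hi - lo)
    return min(1.0, max(0.0, scaled))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, w_hidden, w_out):
    """Forward pass of a 4-input, 2-hidden-unit, 1-output network.

    w_hidden: 2 rows of 5 weights (4 inputs + bias).
    w_out:    3 weights (2 hidden units + bias).
    """
    x = list(inputs) + [1.0]  # bias unit at the input level
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    h = hidden + [1.0]  # bias unit at the hidden level
    return sigmoid(sum(w * v for w, v in zip(w_out, h)))
```

For example, under these assumptions a ball moving at 145 units/sec would be presented to the network as 0.5 on the fourth input.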
Using this new NN with the same decision function over its output as before (the 4-input NN shooting policy), our shooter was able to score 91.5% of the time with the ball moving at different speeds. Again, this success rate was observed in all four action quadrants.
   
Table 3: When the Ball Speed varies, an additional input is needed.
 
 
  
 