4.4 Robot Arm, New Environment



The fourth experiment is essentially the same as the third, but in the robot arm domain. Three hand-crafted configurations of a single obstacle, with the goal in a fixed position, were used, as shown in Figure 32. To increase the statistical variation, each configuration was run five times, each time with a different random seed. The curves in Figure 33 are therefore averages across 15 experimental runs (3 configurations, 5 seeds each).
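The averaging described above can be sketched as follows. This is only an illustration under stated assumptions: `placeholder_run` is a hypothetical stand-in that returns random numbers rather than real learning data, and the curve length of five points is arbitrary. Only the 3 configurations and 5 seeds come from the text.

```python
import random


def average_curves(curves):
    """Return the pointwise mean of a list of equal-length curves."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


def placeholder_run(config_id, seed, n_points=5):
    """Hypothetical stand-in for one experimental run.

    Returns random values, NOT real results; a real run would record
    the learner's performance at successive points during training.
    """
    rng = random.Random(seed * 100 + config_id)
    return [rng.random() for _ in range(n_points)]


configs = [0, 1, 2]        # three obstacle configurations (from the text)
seeds = [0, 1, 2, 3, 4]    # five random seeds per configuration (from the text)

# One curve per (configuration, seed) pair: 3 * 5 = 15 runs in total.
runs = [placeholder_run(c, s) for c in configs for s in seeds]
mean_curve = average_curves(runs)
```

Averaging pointwise assumes every run is sampled at the same step counts; runs of different lengths would first need to be aligned.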

Figure 32: The Different Obstacle Positions

Figure 33: Learning Curves: Robot Arm, New Environment

The top curve of Figure 33 is the Q-learning algorithm; the bottom curve is the function composition system. The knee of the function composition system's curve occurs at about 4,400 steps, and the knee of the basic Q-learning algorithm's curve at about 68,000 steps, giving a speed-up of about 15.
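The speed-up figure is the ratio of the two knee positions. A minimal check of that arithmetic, using the approximate step counts quoted above (the function name `speedup` is chosen here for illustration):

```python
def speedup(baseline_knee_steps, improved_knee_steps):
    """Ratio of steps to the knee: baseline learner over improved learner."""
    return baseline_knee_steps / improved_knee_steps


# Approximate knee positions quoted in the text.
ratio = speedup(68_000, 4_400)
print(round(ratio, 2))  # 15.45, i.e. a speed-up of about 15
```

Because both knee positions are read off the curves by eye, the ratio is only meaningful to about two significant figures.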

Chris Drummond
Thursday January 31 01:30:31 EST 2002