CMU RI 16-711: KDC: Assignment 1 Solution(s)


Parameter optimization

Here is a MATLAB example of parameterizing the trajectory and then using parameter optimization to tune it, available as a tar file and a zip file. Look at the file notes.txt: executing the statements in notes.txt in order reproduces the results shown. The graph shows the trajectory with one interior knot point after optimization.
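As a rough illustration of the idea (not the code in the tar/zip files), here is a minimal Python sketch of parameterizing a torque trajectory by knot points and tuning the knots with a simple derivative-free pattern search. Everything specific in it is an assumption made for the sketch: the unit pendulum model, the cost weights, the 2 second duration, the piecewise-linear interpolation, and the names `interp_knots`, `rollout_cost`, and `pattern_search`. Three knots means two end knots and one interior knot.

```python
import math

def interp_knots(knots, t, duration):
    """Piecewise-linear interpolation of evenly spaced knot values at time t."""
    n = len(knots) - 1                          # number of segments
    s = min(max(t / duration, 0.0), 1.0) * n    # position measured in segments
    i = min(int(s), n - 1)                      # segment index
    frac = s - i
    return (1.0 - frac) * knots[i] + frac * knots[i + 1]

def rollout_cost(knots, duration=2.0, dt=0.01):
    """Simulate an assumed unit pendulum under the knot torque; return its cost."""
    theta, omega = 0.0, 0.0                     # theta = 0 is hanging down (assumed)
    cost = 0.0
    for k in range(int(round(duration / dt))):
        u = interp_knots(knots, k * dt, duration)
        cost += 0.1 * u * u * dt                # running torque penalty (assumed weight)
        omega += (-9.81 * math.sin(theta) + u) * dt   # semi-implicit Euler step
        theta += omega * dt
    err = theta - math.pi                       # distance from upright at the end
    return cost + 10.0 * err * err + omega * omega

def pattern_search(cost_fn, x0, step=1.0, min_step=1e-3, max_sweeps=500):
    """Coordinate pattern search: accept any improving move, halve the step otherwise."""
    x, best = list(x0), cost_fn(x0)
    for _ in range(max_sweeps):
        if step <= min_step:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                c = cost_fn(trial)
                if c < best:                    # accept strictly better points only
                    x, best, improved = trial, c, True
        if not improved:
            step *= 0.5
    return x, best

if __name__ == "__main__":
    knots, cost = pattern_search(rollout_cost, [0.0, 0.0, 0.0])
    print("optimized knots:", knots, "cost:", cost)
```

Because the search only accepts strictly better points, the returned cost is never worse than the cost of the initial guess, but like any local method it can settle in a local minimum. Adding interior knots gives the optimizer more freedom at the price of more parameters to tune.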


Dynamic Programming

The goal of the second part was to get people thinking about how to generate policies that work well for a wide range of initial conditions. One approach is dynamic programming, which produces a value function and a policy. The policy is global, in that it can be used to find the appropriate action for any state.

Trajectory from dynamic programming. The trajectory pumps the pendulum several times before swinging up. Trajectories with slightly different numbers of pumps are close to this one in cost.

Here is the Part 1 solution trajectory.
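To make the dynamic programming idea concrete, here is a minimal Python sketch of value iteration on a discretized pendulum state space. It is not the solution code; the grid resolution, torque set, time step, discount factor, cost function, and nearest-cell state lookup are all assumptions chosen to keep the sketch short. It returns a value function `V` and a policy, both indexed by (angle cell, angular velocity cell), which are the kinds of surfaces plotted on this page.

```python
import math

N_TH, N_W = 24, 21              # grid sizes (assumed); N_TH even, N_W odd
W_MAX = 8.0                     # angular velocity range is [-W_MAX, W_MAX] (assumed)
ACTIONS = (-2.0, 0.0, 2.0)      # available torques (assumed)
DT, GAMMA, G = 0.1, 0.98, 9.81  # time step, discount, gravity/length (assumed)

D_TH = 2.0 * math.pi / N_TH
D_W = 2.0 * W_MAX / (N_W - 1)

def step(theta, omega, u):
    """One semi-implicit Euler step of an assumed unit pendulum (theta = 0 is down)."""
    omega = omega + (-G * math.sin(theta) + u) * DT
    theta = theta + omega * DT
    return theta, omega

def to_index(theta, omega):
    """Nearest grid cell; the angle wraps around, the velocity is clamped."""
    i = int(round(theta / D_TH)) % N_TH
    j = int(round((omega + W_MAX) / D_W))
    return i, min(max(j, 0), N_W - 1)

def one_step_cost(theta, u):
    """Zero at upright (theta = pi) with zero torque, positive elsewhere."""
    return ((1.0 + math.cos(theta)) + 0.01 * u * u) * DT

def value_iteration(sweeps=100):
    V = [[0.0] * N_W for _ in range(N_TH)]
    policy = [[0.0] * N_W for _ in range(N_TH)]
    for _ in range(sweeps):
        for i in range(N_TH):
            for j in range(N_W):
                theta, omega = i * D_TH, -W_MAX + j * D_W
                best_q, best_u = float("inf"), 0.0
                for u in ACTIONS:
                    ni, nj = to_index(*step(theta, omega, u))
                    q = one_step_cost(theta, u) + GAMMA * V[ni][nj]
                    if q < best_q:
                        best_q, best_u = q, u
                V[i][j], policy[i][j] = best_q, best_u   # in-place update
    return V, policy

if __name__ == "__main__":
    V, policy = value_iteration()
    print("value at hanging-down rest state:", V[0][(N_W - 1) // 2])
```

Once `policy` is computed, a trajectory from any start state can be generated by repeatedly looking up the torque for the current cell and calling `step`. The nearest-cell lookup is the crudest possible choice; a finer grid or interpolation between cells would give smoother surfaces.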


Here are the value function and optimal policy for this problem.

Value function: oblique view.

Angle is along X axis, angular velocity is along Y axis (away from viewer).
The trajectory starting point is the top of the red hill.

Value function: top view with a Part 1 trajectory superimposed.

Angle is along horizontal axis, angular velocity is along vertical axis. The Part 1 trajectory starting point is the top of the red hill.

Value function: contour map with a Part 1 trajectory superimposed.

Angle is along horizontal axis, angular velocity is along vertical axis.

Policy: oblique view.

Angle is along X axis, angular velocity is along Y axis (away from viewer).

Policy: top view with a Part 1 trajectory superimposed.

Angle is along horizontal axis, angular velocity is along vertical axis.

Policy: contour map with a Part 1 trajectory superimposed.

Angle is along horizontal axis, angular velocity is along vertical axis.