CMU RI 16-711: KDC: Assignment 1 Solution(s)
Parameter optimization
Here is a MATLAB example of parameterizing the trajectory and
then using parameter optimization to tune it, available as a
tar file
and a zip file. Look at the
file notes.txt. Execute the statements in notes.txt in order
to get the results shown. The figure shows the trajectory with one interior
knot point after optimization.
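As a rough illustration of the idea (this is not the MATLAB code in the archive), the following Python sketch parameterizes a torque trajectory by three knot points (start, one interior knot, end), interpolates linearly between them, and tunes the knot values with a simple random hill-climbing search. The pendulum dynamics, cost weights, time step, and the choice of optimizer are all assumptions made for this example, as is the choice to parameterize torque rather than the state; the assignment's actual model and optimizer may differ.

```python
import numpy as np

# Every constant here is an illustrative assumption, not the assignment's value.
DT = 0.01            # integration step (s)
T_FINAL = 2.0        # trajectory duration (s)
GRAVITY_TERM = 9.81  # g / l for a unit-length pendulum
DAMPING = 0.1        # viscous damping coefficient


def knots_to_torque(knots, n_steps):
    """Linearly interpolate knot-point torques onto the simulation time grid."""
    knot_times = np.linspace(0.0, T_FINAL, len(knots))
    step_times = np.arange(n_steps) * DT
    return np.interp(step_times, knot_times, knots)


def rollout_cost(knots):
    """Simulate the pendulum (angle 0 = hanging down) and return the total cost."""
    n_steps = int(round(T_FINAL / DT))
    torque = knots_to_torque(knots, n_steps)
    angle, velocity, cost = 0.0, 0.0, 0.0
    for u in torque:
        # Running cost: distance from upright (pi), speed, and effort.
        cost += ((angle - np.pi) ** 2 + 0.1 * velocity ** 2 + 0.01 * u ** 2) * DT
        accel = -GRAVITY_TERM * np.sin(angle) - DAMPING * velocity + u
        velocity += accel * DT
        angle += velocity * DT
    # Terminal penalty for not ending upright and at rest.
    cost += 10.0 * ((angle - np.pi) ** 2 + velocity ** 2)
    return cost


def optimize_knots(knots, iterations=300, step=1.0, seed=0):
    """Random hill climbing: keep a perturbed knot vector only if it lowers the cost."""
    rng = np.random.default_rng(seed)
    best = np.array(knots, dtype=float)
    best_cost = rollout_cost(best)
    for _ in range(iterations):
        candidate = best + step * rng.standard_normal(best.shape)
        candidate_cost = rollout_cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


initial_knots = np.zeros(3)  # start knot, one interior knot, end knot
initial_cost = rollout_cost(initial_knots)
tuned_knots, tuned_cost = optimize_knots(initial_knots)
print("cost before:", initial_cost, "cost after:", tuned_cost)
```

Because the search only accepts improvements, the tuned cost can never exceed the starting cost. A gradient-based or simplex optimizer would normally converge faster; random search is used here only to keep the sketch dependency-free.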
Dynamic Programming
The goal of the second part was to get people thinking about how to
generate policies that work well for a wide range of initial conditions.
One approach is dynamic programming, which produces a value
function and a policy. The policy is global, in that it can be used to
find the appropriate action for any state.
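To make the idea concrete, here is a minimal Python sketch of value iteration on a discretized pendulum state space (angle and angular velocity). It is not the code used to produce the figures on this page: the dynamics, torque limits, grid resolution, cost weights, discount factor, and the crude nearest-neighbor lookup of the next state are all assumptions chosen to keep the example short.

```python
import numpy as np

# Every constant here is an illustrative assumption.
DT = 0.05
GRAVITY_TERM = 9.81          # g / l for a unit-length pendulum
MAX_VELOCITY = 8.0
DISCOUNT = 0.99
N_ANGLE, N_VELOCITY = 50, 41

angles = np.linspace(0.0, 2.0 * np.pi, N_ANGLE, endpoint=False)  # 0 = hanging down
velocities = np.linspace(-MAX_VELOCITY, MAX_VELOCITY, N_VELOCITY)
torques = np.linspace(-5.0, 5.0, 5)
d_angle = angles[1] - angles[0]
d_velocity = velocities[1] - velocities[0]

A, W = np.meshgrid(angles, velocities, indexing="ij")

# Precompute, for each action, the one-step cost and the grid cell reached.
step_costs, next_cells = [], []
for u in torques:
    accel = -GRAVITY_TERM * np.sin(A) + u
    w_next = np.clip(W + accel * DT, -MAX_VELOCITY, MAX_VELOCITY)
    a_next = np.mod(A + w_next * DT, 2.0 * np.pi)
    ai = np.rint(a_next / d_angle).astype(int) % N_ANGLE
    wi = np.rint((w_next + MAX_VELOCITY) / d_velocity).astype(int)
    step_costs.append(((A - np.pi) ** 2 + 0.1 * W ** 2 + 0.01 * u ** 2) * DT)
    next_cells.append((ai, wi))

# Value iteration: repeatedly back up the best one-step cost plus future value.
V = np.zeros((N_ANGLE, N_VELOCITY))
for _ in range(300):
    Q = np.stack([c + DISCOUNT * V[ai, wi]
                  for c, (ai, wi) in zip(step_costs, next_cells)])
    V = Q.min(axis=0)

# The policy stores the minimizing torque for every grid cell.
policy = torques[Q.argmin(axis=0)]


def action_for(angle, velocity):
    """Look up the policy's torque for an arbitrary state (nearest grid cell)."""
    ai = int(np.rint(np.mod(angle, 2.0 * np.pi) / d_angle)) % N_ANGLE
    wi = int(np.rint((np.clip(velocity, -MAX_VELOCITY, MAX_VELOCITY)
                      + MAX_VELOCITY) / d_velocity))
    return policy[ai, wi]


print("value hanging down at rest:", V[0, N_VELOCITY // 2])
print("torque hanging down at rest:", action_for(0.0, 0.0))
```

The `action_for` lookup is what makes the policy global: any state maps to a torque without re-running an optimization. Nearest-neighbor lookup introduces grid artifacts at this resolution; interpolating the value function between cells is a common refinement.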
Trajectory from dynamic programming.
The trajectory pumps the pendulum several times before swinging up.
Trajectories with slightly different numbers of pumps are close to this
one in cost. Here is the Part 1 solution trajectory.
Here are the value function
and optimal policy for this problem.
Value function: oblique view.
Angle is along X axis, angular velocity is along Y axis (away from viewer).
Trajectory starting point is top of red hill.
Value function: top view with a Part 1
trajectory superimposed.
Angle is along horizontal axis, angular velocity is along vertical axis.
Part 1 trajectory starting point is top of red hill.
Value function: contourmap with a Part 1
trajectory superimposed.
Angle is along horizontal axis, angular velocity is along vertical axis.
Policy: oblique view.
Angle is along X axis, angular velocity is along Y axis (away from viewer).
Policy: top view with a Part 1
trajectory superimposed.
Angle is along horizontal axis, angular velocity is along vertical axis.
Policy: contourmap with a Part 1
trajectory superimposed.
Angle is along horizontal axis, angular velocity is along vertical axis.