16-745: Dynamic Optimization
Spring 2015
Instructor: Chris Atkeson, cga at cmu
TA: Sasanka Nagavalli snagaval at andrew
MW 3-4:20, NSH 3002
The TA will have office hours after class Monday and Wednesday.
Events of Interest
Last year's course

Jan 12: Introduction to the course.
Goal: Introduce course.
This year's emphasis is TO BE DETERMINED.
Possibilities:

Low level parallelization of quadratic programming (QP) across N cores on one
machine.

Robust control.

Jan 12: Function Optimization Example
Goal: Introduce you to a useful tool, MATLAB
and its optimization subroutines, and show you how to use them on an example.
Robotics: redundant inverse kinematics.
Using Matlab's fminsearch and fminunc.
Using Matlab's fminsearch and fminunc, with
desired posture.
Using Matlab's fmincon.
Relationship of Jacobian approach to gradient descent.
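For readers who prefer a non-MATLAB sketch, here is the same idea in Python: gradient descent on the squared reaching error of a planar 2-link arm, with finite-difference gradients standing in for what fminunc does internally. The link lengths, target, step size, and iteration count below are all made up for illustration.

```python
import math

L1, L2 = 1.0, 1.0            # link lengths (made up)
target = (0.5, 0.5)          # desired end-effector position (made up)

def fk(q):
    """Forward kinematics of a planar 2-link arm."""
    x = L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1])
    y = L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1])
    return x, y

def cost(q):
    x, y = fk(q)
    return (x - target[0]) ** 2 + (y - target[1]) ** 2

def grad(q, h=1e-6):
    """Central finite differences, as fminunc uses when no gradient is given."""
    g = []
    for i in range(len(q)):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        g.append((cost(qp) - cost(qm)) / (2 * h))
    return g

q = [0.3, 0.3]               # initial joint angles
for _ in range(5000):        # plain gradient descent, fixed step
    q = [qi - 0.05 * gi for qi, gi in zip(q, grad(q))]
```

Note the gradient of the squared error is 2 Jᵀ(fk(q) - target), which is why a gradient step and a (transpose-)Jacobian IK step point in the same direction.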

Jan 14:
Function optimization using
first and
second
order gradient methods.
Goal: Review gradient descent approaches.
A nice chapter on function optimization techniques:
Numerical Recipes in C, chapter 10
(2nd or 3rd edition, 2nd edition is electronically available for free
under Obsolete Versions):
Minimization or Maximization of Functions,
This material from any other numerical methods book is also fine.
Resources:
Matlab fminunc,
Numerical Recipes,
GSL,
AMPL,
NEOS,
software list 1,
Useful
software guide,
gradient method,
line search,
conjugate gradient,
conjugate gradient v2,
quasi-Newton/variable metric methods, and
Newton's method.
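A minimal illustration (my own toy example, not from the readings) of first- versus second-order methods on f(x) = x - ln(x), whose minimum is at x = 1: Newton's method divides the gradient by the curvature and converges in a handful of steps, while fixed-step gradient descent needs many more.

```python
import math

f = lambda x: x - math.log(x)     # objective, minimum at x = 1
df = lambda x: 1 - 1 / x          # first derivative
d2f = lambda x: 1 / x ** 2        # second derivative (curvature)

# first order: gradient descent with a fixed step
x_gd, steps_gd = 0.5, 0
while abs(df(x_gd)) > 1e-10:
    x_gd -= 0.2 * df(x_gd)
    steps_gd += 1

# second order: Newton's method scales the step by the inverse curvature
x_nt, steps_nt = 0.5, 0
while abs(df(x_nt)) > 1e-10:
    x_nt -= df(x_nt) / d2f(x_nt)
    steps_nt += 1
# Newton reaches the same tolerance in far fewer iterations
```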

Jan 19: No Class

Jan 21: Non-gradient ("derivative-free") function optimization methods:
Goal: Review non-gradient approaches.
hill climbing
(including
local search,
local unimodal sampling,
pattern search,
random search,
random optimization),
Nelder Mead/Simplex/Amoeba method,
Matlab fminsearch,
simulated annealing,
fit surfaces (for example
Response Surface Methodology (RSM),
Memory-based Stochastic Optimization, and
Q2),
evolutionary algorithms,
genetic algorithms,
and ...
Paper:
Derivative-free optimization: A review of algorithms and comparison of software implementations by Luis Miguel Rios and Nikolaos V. Sahinidis,
Book: Introduction to Derivative-Free Optimization
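A sketch of the simplest method above, random optimization / hill climbing, in Python (the test function and shrink schedule are invented for illustration): propose a random perturbation, keep it only if it improves the objective, no derivatives anywhere.

```python
import random

random.seed(0)

def f(x, y):
    """Made-up smooth objective with its minimum at (2, -1)."""
    return (x - 2) ** 2 + (y + 1) ** 2

x, y = 0.0, 0.0
step = 1.0
best = f(x, y)
for _ in range(5000):
    cx = x + random.uniform(-step, step)   # random perturbation
    cy = y + random.uniform(-step, step)
    c = f(cx, cy)
    if c < best:                           # keep only improvements
        x, y, best = cx, cy, c
    else:
        step *= 0.999                      # slowly shrink the search radius
```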

Jan 21-26:
Covariance Matrix Adaptation Evolution Strategy.
See also Hansen web page.
Example 1,
Example 2,
Example 3,
Example 4.
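CMA-ES itself is involved; as a much-simplified relative, here is a (1+1) evolution strategy with 1/5th-success-rule step-size adaptation in Python (CMA-ES additionally adapts a full covariance matrix and uses a population of samples). The sphere objective, start point, and constants are placeholders.

```python
import random

random.seed(1)

def sphere(x):
    """Toy objective; CMA-ES targets much harder, ill-conditioned functions."""
    return sum(xi * xi for xi in x)

x = [5.0, -3.0, 2.0]          # start far from the optimum at the origin
sigma = 1.0                   # global mutation step size
fx = sphere(x)
for _ in range(3000):
    cand = [xi + sigma * random.gauss(0.0, 1.0) for xi in x]
    fc = sphere(cand)
    if fc < fx:
        x, fx = cand, fc
        sigma *= 1.5              # success: grow the step
    else:
        sigma *= 1.5 ** -0.25     # failure: shrink (equilibrium at ~1/5 success)
```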

Jan 26: Constraints.
Soft/hard constraints, penalty functions,
Barrier functions,
Lagrange Multipliers,
Augmented Lagrangian method,
Interior point methods vs. Simplex methods vs. soft constraint methods,
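A sketch of the quadratic penalty idea in Python (problem and schedule invented for illustration): minimize x + y on the unit circle by adding mu * c(x)² to the objective and increasing mu. The true optimum is x = y = -1/sqrt(2), and the penalized minimizer approaches it as mu grows.

```python
import math

# minimize x + y subject to x^2 + y^2 = 1; true optimum x = y = -1/sqrt(2)
def penalized(v, mu):
    x, y = v
    c = x * x + y * y - 1.0          # constraint violation
    return x + y + mu * c * c        # objective + quadratic penalty

def grad(v, mu, h=1e-6):
    return [(penalized([v[0] + h, v[1]], mu) - penalized([v[0] - h, v[1]], mu)) / (2 * h),
            (penalized([v[0], v[1] + h], mu) - penalized([v[0], v[1] - h], mu)) / (2 * h)]

v = [0.5, -0.5]
for mu in [1.0, 10.0, 100.0, 1000.0]:    # stiffen the penalty gradually
    lr = 0.05 / (1.0 + mu)               # smaller steps as curvature grows
    for _ in range(5000):
        g = grad(v, mu)
        v = [vi - lr * gi for vi, gi in zip(v, g)]
# v now nearly satisfies the constraint and nearly minimizes x + y on it
```

The growing stiffness of the penalized problem (and the matching shrinking step size) is exactly the ill-conditioning that motivates barrier, Lagrange multiplier, and augmented Lagrangian methods.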

Jan 26-28:
Quadratic Programming and
Sequential quadratic programming,
Matlab fmincon.
SNOPT,
CVXGEN
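For the equality-constrained special case, a QP can be solved exactly from its KKT conditions with one linear solve; the tiny made-up problem below sketches this in Python (fmincon, SNOPT, and CVXGEN handle the general inequality-constrained case).

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting (tiny dense solver)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# QP: minimize 0.5*x'Hx + f'x  subject to  Ax = b,
# via the KKT system [H A'; A 0][x; lam] = [-f; b]
H = [[1.0, 0.0], [0.0, 1.0]]
f = [-1.0, -2.0]
Aeq, beq = [1.0, 1.0], 1.0

KKT = [[H[0][0], H[0][1], Aeq[0]],
       [H[1][0], H[1][1], Aeq[1]],
       [Aeq[0], Aeq[1], 0.0]]
rhs = [-f[0], -f[1], beq]
x0, x1, lam = solve(KKT, rhs)
# minimizer is (0, 1): the unconstrained optimum (1, 2) projected onto x0+x1=1
```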

Jan 28: Automatic differentiation
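A minimal sketch of forward-mode automatic differentiation via dual numbers in Python (supporting only + and *, enough for polynomials): derivatives come out exact, with no finite-difference step-size tuning.

```python
class Dual:
    """Forward-mode AD: carry (value, derivative) through arithmetic."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val,
                    self.dot * o.val + self.val * o.dot)  # product rule
    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1   # f'(x) = 6x + 2

x = Dual(2.0, 1.0)                 # seed dx/dx = 1
y = f(x)
# y.val == 17.0 (= f(2)) and y.dot == 14.0 (= f'(2)), exactly
```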

Jan 28: Dynamics and Numerical Integration
Continuous time, discrete time. Euler integration. Forward and inverse dynamics. Linearization.
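A Python sketch of these pieces on a damped pendulum (parameters made up): Euler integration of the continuous-time dynamics, then linearization A = df/dx by finite differences about the hanging equilibrium.

```python
import math

# damped pendulum (made-up parameters): thdd = -(g/l) sin(th) - b*thd + u
g_over_l, b = 9.81, 0.1

def dynamics(x, u):
    th, thd = x
    return [thd, -g_over_l * math.sin(th) - b * thd + u]

def euler_step(x, u, dt):
    dx = dynamics(x, u)
    return [x[0] + dt * dx[0], x[1] + dt * dx[1]]

x = [0.1, 0.0]                       # small swing about hanging equilibrium
for _ in range(1000):                # 1 second of Euler integration
    x = euler_step(x, 0.0, 0.001)

def linearize(x0, u0, h=1e-6):
    """A = df/dx by central finite differences."""
    cols = []
    for i in range(2):
        xp, xm = list(x0), list(x0)
        xp[i] += h
        xm[i] -= h
        fp, fm = dynamics(xp, u0), dynamics(xm, u0)
        cols.append([(fp[j] - fm[j]) / (2 * h) for j in range(2)])
    return [[cols[j][i] for j in range(2)] for i in range(2)]  # transpose

A = linearize([0.0, 0.0], 0.0)
# about hanging: A is approximately [[0, 1], [-g/l, -b]]
```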

Feb 2: Handling 3D Orientation:
Metrics for how close two orientations
are.
Metrics for 3D Rotations: Comparison and Analysis,
Rigid-Body Attitude Control: Using Rotation Matrices for Continuous, Singularity-Free Control Laws,
Closed-Loop Manipulator Control Using Quaternion Feedback
Rotation matrix for small rotations
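One standard metric, sketched in Python: the geodesic distance on SO(3), i.e. the rotation angle of the relative rotation R1 · R2ᵀ, recovered from its trace.

```python
import math

def rot_z(a):
    """Rotation by angle a about the z axis."""
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a),  math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]

def mul_transpose(R1, R2):
    """R1 * R2^T, the relative rotation."""
    return [[sum(R1[i][k] * R2[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def angle_between(R1, R2):
    """Geodesic distance on SO(3): angle = arccos((trace - 1) / 2)."""
    R = mul_transpose(R1, R2)
    tr = R[0][0] + R[1][1] + R[2][2]
    return math.acos(max(-1.0, min(1.0, (tr - 1.0) / 2.0)))  # clamp for safety

d = angle_between(rot_z(0.7), rot_z(0.2))
# two z-rotations differ by their angle difference, so d == 0.5
```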

Feb 4: Formulating trajectory optimization as function optimization.
Examples of formulating a trajectory optimization problem
as a function optimization problem:
Case Studies In Trajectory Optimization: Trains, Planes, And Other
Pastimes,
Robert J. Vanderbei
Example use of AMPL
A free trial version of AMPL is available from here.
AMPL is also available for remote use through the Neos Server.
Click on SNOPT/[AMPL Input] under Nonlinearly Constrained Optimization.
Example use of Matlab: pend1xu,
pend1u,
pend1x
Spacetime Optimization: Witkin paper text
Witkin paper figures

Feb 4:
Use of splines in trajectory optimization.
Cubic Hermite spline.
Quintic Hermite interpolation.
Example 1,
Collocation,
Pseudospectral X.
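A sketch of cubic Hermite interpolation in Python: the four Hermite basis polynomials blend endpoint positions and velocities, so knot constraints are met exactly; a trajectory optimizer can then treat the knot values as decision variables.

```python
def hermite(p0, p1, v0, v1, t):
    """Cubic Hermite interpolation on [0, 1] from endpoint positions/velocities."""
    h00 = 2 * t ** 3 - 3 * t ** 2 + 1     # weights p0
    h10 = t ** 3 - 2 * t ** 2 + t         # weights v0
    h01 = -2 * t ** 3 + 3 * t ** 2        # weights p1
    h11 = t ** 3 - t ** 2                 # weights v1
    return h00 * p0 + h10 * v0 + h01 * p1 + h11 * v1
```

By construction the curve hits p0 and p1 at t = 0 and t = 1 with slopes v0 and v1; quintic Hermite interpolation adds accelerations at the knots the same way.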

Feb 9:
Policy optimization I: Use function optimization.
What is a policy?
Known in machine learning/reinforcement learning as policy search or refinement, ...
slides
See examples in the CMA-ES section for policy optimization.
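A toy policy-optimization-as-function-optimization example in Python (plant, horizon, and search range all invented): the policy is a single feedback gain k in u = -k·x, rollouts define the cost, and a derivative-free random search optimizes k.

```python
import random

random.seed(4)

a, b = 1.1, 1.0              # made-up unstable scalar plant x' = a*x + b*u

def rollout_cost(k, x0=1.0, steps=50):
    """Simulate the policy u = -k*x and accumulate a quadratic cost."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -k * x
        cost += x * x + u * u
        x = a * x + b * u
    return cost

# treat the gain k as the decision variable of a function optimization;
# any method from the earlier lectures works, here plain random search
best_k, best_c = 0.0, rollout_cost(0.0)
for _ in range(2000):
    k = random.uniform(-2.0, 4.0)
    c = rollout_cost(k)
    if c < best_c:
        best_k, best_c = k, c
# the optimized policy stabilizes the plant: |a - b*best_k| < 1
```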

Feb 9: Ways to robustify function optimization:
Problems: how to choose a method (more of an art than a science), local minima, bad answers, discontinuities, redundant/rank-deficient constraints,
bad scaling, no formulas for derivatives, you are lazy, computational cost.
Techniques: Levenberg-Marquardt,
Trust regions,
line search,
scaling and preconditioning, regularize parameters, soft constraints,
sparse methods,
Continuation Methods,
Paper on continuation methods,
Hand of God, allow constraint violations, add extra constraints,
Matlab recommendations

Feb 11:
Dynamic Programming.
Bellman equation,
slides
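A minimal Bellman-equation example in Python (a made-up 5-state shortest-path world): value iteration repeatedly applies the Bellman backup until the cost-to-go converges.

```python
# tiny 1-D grid world: states 0..4, absorbing goal at 4,
# actions move left/right (edges saturate), each step costs 1
n, goal = 5, 4
V = [0.0] * n
for _ in range(50):                          # Bellman backups
    newV = []
    for s in range(n):
        if s == goal:
            newV.append(0.0)                 # zero cost-to-go at the goal
        else:
            succ = [max(s - 1, 0), min(s + 1, n - 1)]
            newV.append(min(1.0 + V[sp] for sp in succ))   # Bellman equation
    V = newV
# V converges to the shortest-path cost-to-go [4, 3, 2, 1, 0]
```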

Feb 16:
Linear Quadratic Regulator,
Riccati Equation,
Differential Dynamic Programming
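A scalar sketch in Python (made-up plant and weights): iterating the discrete-time Riccati equation to its fixed point yields the LQR gain, which stabilizes an unstable plant.

```python
# scalar discrete-time LQR (made-up numbers): x' = a*x + b*u,
# cost = sum of q*x^2 + r*u^2
a, b, q, r = 1.2, 1.0, 1.0, 1.0      # note |a| > 1: unstable open loop

P = q
for _ in range(200):                 # iterate the Riccati equation
    K = (b * P * a) / (r + b * P * b)        # optimal feedback gain
    P = q + a * P * a - a * P * b * K        # Riccati update

# closed loop x' = (a - b*K) x is stable: |a - b*K| < 1
```

DDP applies the same backward Riccati-like recursion along a nominal trajectory of a nonlinear system, using local quadratic models of dynamics and cost.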

Feb 18-23: Ways to reduce the curse of dimensionality
slides

Feb 23: Policy Optimization II: Optimization using model-based gradients
slides

Feb 25: Robustness
Robustness to random disturbances, varying initial conditions, parametric
model error, structural modeling error such as
high frequency unmodelled dynamics,
and model jumps (touchdown and liftoff during walking, for example).
Monte Carlo trajectory/policy optimization.

Feb 25: Receding Horizon Control (a.k.a. Model Predictive Control (MPC)).

Feb 25: Robustness using Linear Matrix Inequalities
Robustness to parametric uncertainty in the linear(ized) model.
I can't find a good reference on robustness using linear matrix inequalities,
but here is a tutorial on LMIs

Feb 25: Robustness: Policy Optimization with Multiple Models.
Monte Carlo, DP, and DDP approaches to Multiple Models.

Mar 2:
State Estimation,
Uncertainty Propagation:
Gaussian Propagation (like Kalman Filter),
Unscented (like Unscented Filter), Second Order Kalman Filter (See Kendrick below).
Review of Gaussians slides
State estimation slides
Matlab Kalman filter example
and
minimum jerk trajectory subroutine.
Example mobile robot Kalman filter slides
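A 1-D Kalman filter sketch in Python (random-walk state, made-up noise levels): the predict/update cycle drives the posterior variance P to a steady state that, for a linear system, is independent of the measurements themselves.

```python
import random

random.seed(2)

# 1-D random walk: x' = x + w (variance Q), measurement z = x + v (variance R)
Q, R = 0.01, 1.0                 # made-up noise variances
x_true = 0.0
x_hat, P = 0.0, 1.0              # estimate and its variance

for _ in range(500):
    x_true += random.gauss(0.0, Q ** 0.5)      # simulate the true state
    z = x_true + random.gauss(0.0, R ** 0.5)   # noisy measurement
    P = P + Q                                  # predict: variance grows
    K = P / (P + R)                            # Kalman gain
    x_hat = x_hat + K * (z - x_hat)            # update: blend in z
    P = (1.0 - K) * P                          # posterior variance shrinks
```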

March 4: No Class

March 9-11: No Class

March 16: Robustness and state estimation:
Linear-quadratic-Gaussian control (LQG),
Separation principle, Certainty equivalence,
Example of bad interactions, Loop Transfer Recovery (LTR),
A paper on the topic,
Policy optimization approaches.

March 18:
Dual Control.
Simple example.
Information state DP.

March 18: Local Approaches to Dual Control/Stochastic DDP
Information state trajectory optimization.
Stochastic Control for Economic Models,
David Kendrick, Second Edition 2002.

March 23: A*-like algorithms: R*

March 23: Optimizing Walking:
3D Walking Based on Online Optimization

March 25: Avoiding obstacles using Sampling based methods: RRT,
slides
Projected RRT,
RRT*
slides
video 1
video 2
LQR-RRT*
Random Sampling DP
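A bare-bones RRT sketch in Python (empty 2-D unit square, made-up step size and goal bias; a real planner would collision-check every extension): sample a point, find the nearest tree node, and extend one step toward the sample.

```python
import math
import random

random.seed(3)

start, goal = (0.1, 0.1), (0.9, 0.9)
step = 0.05                   # extension step size (made up)
nodes = [start]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

reached = False
for _ in range(5000):
    # sample uniformly, with a 10% bias toward the goal (a common trick)
    s = goal if random.random() < 0.1 else (random.random(), random.random())
    near = min(nodes, key=lambda n: dist(n, s))   # nearest node in the tree
    d = dist(near, s)
    if d <= step:
        new = s
    else:                                         # extend one step toward s
        new = (near[0] + step * (s[0] - near[0]) / d,
               near[1] + step * (s[1] - near[1]) / d)
    nodes.append(new)      # a real planner would collision-check this edge
    if dist(new, goal) <= step:
        reached = True
        break
```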

March 25: Avoiding obstacles using gradient methods: CHOMP
STOMP

March 30: Handling contact:
Posa, Tedrake,
Posa video
Contact-Invariant Optimization
Hands
Legs
Trajectory Optimization for Full-Body Movements with Complex Contacts

April 1: Learning From Demonstration

April 1: Inverse Optimal Control
Sergey Levine, Vladlen Koltun. Continuous Inverse Optimal Control with Locally Optimal Examples

April 6:
*** Generalizing Locomotion Style to New Animals With Inverse Optimal Regression

April 6: Reinforcement Learning: Model free policy optimization.
Kober, J.; Peters, J. (2011). Policy Search for Motor Primitives in Robotics, Machine Learning, 84(1-2), pp. 171-203

April 6: Comparison of various RL methods: CMA-ES, CEM, PI^2.
Freek Stulp and Olivier Sigaud. Path Integral Policy Improvement with Covariance Matrix Adaptation. In Proceedings of the 29th International Conference on Machine Learning (ICML), 2012.

April 8:
*** Ensemble-CIO: Full-Body Dynamic Motion Planning that Transfers to
Physical Humanoids
*** movie
*** Learning Contact-Rich Manipulation Skills with Guided Policy Search

April 13:
Trajectory optimization based on integrating the dynamics:
calculus of variations,
EulerLagrange equation,
Discrete time Pontryagin's minimum principle,
Pontryagin's minimum principle,
HamiltonJacobiBellman equation,
costate equations,
shooting methods,
multiple shooting methods,
Karush-Kuhn-Tucker conditions,
Continuation Methods,
Meta-optimization,
Learning during optimization
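A minimal single-shooting example in Python (invented problem): choose the control parameter, integrate the dynamics forward, and adjust until the terminal condition holds. Here the terminal state x(T) of a double integrator is monotone in the constant control u, so bisection replaces a general root-finder; analytically u = 2.

```python
def simulate(u, T=1.0, dt=0.001):
    """Integrate a double integrator x'' = u forward from rest at the origin."""
    x, v = 0.0, 0.0
    for _ in range(int(T / dt)):
        x += dt * v          # Euler integration
        v += dt * u
    return x

# shoot: adjust the scalar control until the terminal condition x(T) = 1 holds
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if simulate(mid) < 1.0:
        lo = mid
    else:
        hi = mid
u_star = 0.5 * (lo + hi)
```

Multiple shooting breaks the horizon into segments with their own initial states, constrained to match at the boundaries, which tames the sensitivity of long forward integrations.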

April 15:
*** Online Motion Synthesis Using Sequential Monte Carlo
*** Learning Bicycle Stunts

April 20: Multi-robot Systems:
Robot Swarms,
Neglect Benevolence,
Aligning Coordinate Frames in Multi-Robot Systems,
Developing Aids to Optimize Human Interaction with Dynamical Systems
The material covered will come from the following papers.
Nagavalli, Sasanka, et al. "Neglect Benevolence in human control of
robotic swarms." Robotics and Automation (ICRA), 2014 IEEE
International Conference on. IEEE, 2014.
Nagavalli, Sasanka, et al. "Aligning coordinate frames in multi-robot
systems with relative sensing information." Intelligent Robots and
Systems (IROS 2014), 2014 IEEE/RSJ International Conference on. IEEE,
2014.
Nagavalli, Sasanka, et al. "Bounds of Neglect Benevolence in Input
Timing for Human Interaction with Robotic Swarms." Proceedings of the
Tenth Annual ACM/IEEE International Conference on Human-Robot
Interaction. ACM, 2015.
A textbook for students to read more about distributed algorithms for
multirobot systems:
Bullo, Francesco, Jorge Cortés, and Sonia Martínez. Distributed
control of robotic networks: a mathematical approach to motion
coordination algorithms. Princeton University Press, 2009.

April 22: Convex optimization techniques for generating trajectories:
Flight Testing of Trajectories Computed by G-FOLD:
Fuel Optimal Large Divert Guidance Algorithm for
Planetary Landing
Lossless Convexification of Powered-Descent Guidance with Non-Convex
Thrust Bound and Pointing Constraints

April 22: Inverse Optimal Control
Ziebart, B. D., Maas, A. L., Bagnell, J. A., & Dey, A. K. (2008). Maximum
Entropy Inverse Reinforcement Learning. In AAAI (pp. 1433-1438).
Ziebart, B. D., Maas, A. L., Bagnell, J. A., & Dey, A. K. (2009). Human
Behavior Modeling with Maximum Entropy Inverse Optimal Control. In AAAI
Spring Symposium: Human Behavior Modeling (p. 92).

Apr. 27: Project presentations

Apr. 29: Project presentations

May 10: Project Writeups Due
Other topics

Learning Heuristic Functions

SIMD parallelism
How do GPUs and other forms of SIMD parallelism affect optimization
algorithm design?
Assignments

Assignment 0 (Due Jan. 18): Send CGA email:
Who are you?
Why are you here?
What research do you do?
Describe any optimization you have done (point me to papers or
web pages if they exist).
Any project ideas?
What topics would you especially like the course to cover?
Be sure your name is obvious in the email, and you mention the course
name or number. I teach more than one course, and a random email from
robotlover@cs.cmu.edu is hard for me to process.

Assignment 1 (Due Feb. 15): Using Optimization
to do Inverse Kinematics

Assignment 2 (Due Mar. 15): Policy Optimization

Assignment 3 (Due Apr. 5): Planning Walking