Learning or planning challenging dynamic tasks on high-dimensional humanoid robots remains difficult. Atkeson has emphasized the close links between model-based reinforcement learning, optimization, and planning for dynamic tasks. Dynamic programming provides a methodology for developing planners and controllers for nonlinear systems, but general dynamic programming is computationally intractable. We have developed procedures based on trajectory libraries that allow complex planning and control problems to be solved in a reasonable amount of time. We use second-order local plan optimization to generate locally optimal plans and local models of the value function and its derivatives. We combine many local plans to build more global plans. We maintain global consistency of the local models of the value function, which guarantees that our locally optimal plans are actually globally optimal, up to the resolution of the search procedures.
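To make the idea of a trajectory library with local value-function models concrete, here is a minimal sketch in Python. It is not the procedure from the papers listed here. It assumes a toy linear-quadratic problem (a discretized double integrator) so that the local models can be written down exactly from a Riccati solution rather than produced by a second-order trajectory optimizer. All names (`solve_riccati`, `build_library`, `query`) and all constants are hypothetical choices made for illustration.

```python
import numpy as np

# Toy problem (an assumption, chosen so results are easy to check):
# double integrator x_next = A x + B u with cost sum of x'Qx + u'Ru.
DT = 0.1
A = np.array([[1.0, DT], [0.0, 1.0]])
B = np.array([[0.0], [DT]])
Q = np.eye(2)
R = np.array([[0.1]])


def solve_riccati(iters=2000):
    """Iterate the discrete-time Riccati recursion until it has (nearly) converged."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    P = 0.5 * (P + P.T)  # remove tiny numerical asymmetry
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


def build_library(P, K, starts, steps=30):
    """Roll out trajectories and store a local model at every visited state.

    Each entry holds the state, the action, the value V, its gradient Vx,
    its Hessian Vxx, and a local feedback gain.
    """
    library = []
    for x0 in starts:
        x = np.array(x0, dtype=float)
        for _ in range(steps):
            u = -K @ x
            library.append({
                "x": x.copy(), "u": u, "K": K,
                "V": float(x @ P @ x), "Vx": 2.0 * P @ x, "Vxx": 2.0 * P,
            })
            x = A @ x + B @ u
    return library


def query(library, x):
    """Predict value and action at x from the nearest stored local model."""
    x = np.asarray(x, dtype=float)
    entry = min(library, key=lambda e: np.linalg.norm(e["x"] - x))
    dx = x - entry["x"]
    # Second-order Taylor expansion of the value function around the stored state.
    value = entry["V"] + entry["Vx"] @ dx + 0.5 * dx @ entry["Vxx"] @ dx
    # Locally linear policy around the stored state and action.
    action = entry["u"] - entry["K"] @ dx
    return float(value), action
```

In this linear-quadratic toy case the quadratic local model is exact, so every library entry agrees with every other one. For a nonlinear system the local models would only be valid near their own trajectories, which is why consistency between neighboring local models has to be checked and enforced.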
Trajectory-Based Dynamic Programming, C. G. Atkeson and C. Liu, in press.
Random Sampling of States in Dynamic Programming, C. G. Atkeson and B. Stephens, in IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, Vol. 38, No. 4, pp. 924-929, 2008.
Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming, C. G. Atkeson, Proceedings, Neural Information Processing Systems, Denver, Colorado, December, 1993, In: Neural Information Processing Systems 6, J. D. Cowan, G. Tesauro, and J. Alspector, eds. Morgan Kaufmann, 1994.
Nonparametric Representation of Policies and Value Functions: A Trajectory-Based Approach, C. G. Atkeson and J. Morimoto, Proceedings, Neural Information Processing Systems, Denver, Colorado, December, 2002, In: Neural Information Processing Systems 15, MIT Press, 2003.
Morimoto and Atkeson have developed robust versions of the local trajectory planner. [IROS 2003]
Standing balance control using a trajectory library, Liu, Chenggang; Atkeson, Christopher G.; IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2009, Pages: 3031 - 3036.
M. Stolle and C. G. Atkeson, "Finding and transferring policies using stored behaviors," Autonomous Robots, 29(2): 169-200, 2010.
M. Stolle, H. Tappeiner, J. Chestnutt, and C. G. Atkeson, "Transfer of policies based on trajectory libraries," IEEE/RSJ International Conference on Intelligent Robots and Systems, 4234-4240, 2007.
M. Stolle and C. G. Atkeson, "Policies Based on Trajectory Libraries," IEEE International Conference on Robotics and Automation, pp. 3344-3349, 2006.
Learning Control in Robotics, S. Schaal and C. G. Atkeson, IEEE Robotics & Automation Magazine, 17, 20-29, 2010.
Applying local learning to robot learning:
Schaal, S., and C. G. Atkeson,
Robot Juggling: An Implementation of Memory-based Learning, Control Systems Magazine, 14(1):57-71, 1994.
A book describing parametric model-based learning for robots:
C. H. An, C. G. Atkeson, and J. M. Hollerbach,
Model-Based Control of a Robot Manipulator, MIT Press, Cambridge, Massachusetts, 1988.
How can a robot learn from watching a human?
Atkeson, C. G. and S. Schaal
Robot Learning From Demonstration, Machine Learning: Proceedings of the Fourteenth International Conference (ICML '97), Edited by Douglas H. Fisher, Jr. pp. 12-20, Morgan Kaufmann, San Francisco, CA, 1997.
An overview of work on local learning algorithms is given by:
Atkeson, C. G., Moore, A. W., & Schaal, S.
"Locally Weighted Learning." Artificial Intelligence Review, 11:11-73, 1997.
An overview of local learning applied to robots is given by:
Atkeson, C. G., Moore, A. W., & Schaal, S.
"Locally Weighted Learning for Control." Artificial Intelligence Review, 11:75-113, 1997.
Looking at local learning from a neural network point of view:
Atkeson, C. G., and S. Schaal,
Memory-Based Neural Networks For Robot Learning, Neurocomputing, 9(3):243-69, 1995.
A mixture of experts approach to local learning is presented in:
Schaal, S., & Atkeson, C. G.
From Isolation to Cooperation: An Alternative View of a System of Experts, In: D.S. Touretzky, and M.E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8. Cambridge, MA: MIT Press. 1996.
Stefan Schaal, Atkeson, and colleagues have explored new approaches to nonparametric learning, Receptive Field Weighted Regression (RFWR) and Locally Weighted Projection Regression (LWPR), in which receptive fields representing local models are created and maintained during learning. These approaches provide an interesting alternative perspective on locally weighted learning. Unlike the original version of locally weighted learning, these approaches maintain local intermediate data structures such as receptive fields. [Applied Intelligence 2002] [ICRA 2000] [Neural Computation 1998] [NIPS 1997]
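For contrast with the receptive-field approaches, the following Python sketch shows the basic memory-based form of locally weighted linear regression: all data are kept, and a small weighted linear fit is solved from scratch for each query. It is not RFWR or LWPR, which maintain and update receptive fields incrementally. The function name `lwr_predict`, the Gaussian kernel, and the `bandwidth` and `ridge` parameters are assumptions made for illustration.

```python
import numpy as np


def lwr_predict(X, y, query, bandwidth=0.3, ridge=1e-8):
    """Predict y at `query` with a locally weighted linear fit.

    X is an (n, d) array of stored inputs, y an (n,) array of stored outputs.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    q = np.asarray(query, dtype=float)

    # Gaussian weights: stored points near the query count the most.
    sq_dist = np.sum((X - q) ** 2, axis=1)
    w = np.exp(-0.5 * sq_dist / bandwidth ** 2)

    # Center inputs on the query so the fitted intercept is the prediction.
    Z = np.hstack([X - q, np.ones((len(X), 1))])
    WZ = Z * w[:, None]

    # Weighted least squares with a small ridge term for numerical stability.
    lhs = Z.T @ WZ + ridge * np.eye(Z.shape[1])
    rhs = WZ.T @ y
    beta = np.linalg.solve(lhs, rhs)
    return float(beta[-1])
```

A call such as `lwr_predict(X, y, [0.5])` refits the local model for that single query. No intermediate structure is retained between queries, which is the property that distinguishes this memory-based version from methods that store receptive fields.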
Schaal, S., D. Sternad and C. G. Atkeson,
One-handed Juggling: Dynamical Approaches to a Rhythmic Movement Task, Journal of Motor Behavior, 28(2):165-183, 1996.
Schaal, S. and C. G. Atkeson,
Open Loop Stable Control Strategies for Robot Juggling, In: IEEE International Conference on Robotics and Automation, Vol.3, pp.913-918, Atlanta, Georgia, 1993.
An overview of the work at ATR on humanoid robots is given in Atkeson et al., "Using Humanoid Robots to Study Human Behavior", IEEE Intelligent Systems, 15(4):46-56, 2000.