The Skinnerbots Project
We are developing computational theories of operant conditioning. While
classical (Pavlovian) conditioning has a well-developed theory, implemented
in the Rescorla-Wagner model and its descendants (work by Sutton &
Barto, Grossberg, Klopf, Gallistel, and others), there is at present no
comprehensive theory of operant conditioning. Our work has four components:
- Develop computationally explicit models of operant conditioning that
reproduce classical animal learning experiments with rats, dogs, pigeons,
- Demonstrate the workability of these models by implementing them
on mobile robots, which then become trainable robots
(Skinnerbots). We originally used Amelia, a B21
robot manufactured by Real World
Interface, as our implementation platform. We are moving to the
- Map our computational theories onto neuroanatomical structures known
to be involved in animal learning, such as the hippocampus, amygdala, and
- Explore issues in human-robot interaction that arise when
non-scientists try to train robots as if they were animals.
Publications and News Reports
- N.D. Daw, A.C. Courville, and D.S. Touretzky (in press) Timing and partial observability in the
dopamine system. In S. Becker, S. Thrun, and K. Obermayer (Eds.),
Advances in Neural Information Processing Systems 15.
Cambridge, MA: MIT Press.
- N.D. Daw and D.S. Touretzky (2002) Long-term reward
prediction in TD models of the dopamine system. Neural
Computation, 14(11), 2567-2583.
- D.S. Touretzky, N.D. Daw, and E.J. Tira-Thompson (2002) Combining configural and TD learning
on a robot. Proceedings of the Second International Conference on
Development and Learning, pp. 47-52. Cambridge, MA, June 12-15. IEEE
- N.D. Daw, A.C. Courville, and D.S. Touretzky (2002) Dopamine and inference about timing.
Proceedings of the Second International Conference on Development and
Learning, pp. 271-276. Cambridge, MA, June 12-15. IEEE Computer
Society. (gzipped Postscript)
- A.C. Courville and D.S. Touretzky (2002) Modeling temporal structure in
classical conditioning. In T. Dietterich, S. Becker, and
Z. Ghahramani (Eds.), Advances in Neural Information Processing
Systems 14. Cambridge, MA: MIT Press. (gzipped Postscript)
- N.D. Daw and D.S. Touretzky (2001) Operant behavior
suggests attentional gating of dopamine system inputs.
- N.D. Daw and D.S. Touretzky (2000) Behavioral
considerations suggest an average reward TD model of the dopamine
system. Neurocomputing, 32:679-684.
- L.M. Saksida, S.M. Raymond, and D.S. Touretzky (1998) Shaping robot behavior using principles
from instrumental conditioning. Robotics and Autonomous
- D.S. Touretzky and L.M. Saksida (1997) Operant conditioning in
Skinnerbots. Adaptive Behavior 5(3/4):219-247.
- L.M. Saksida and D.S. Touretzky (1997) Application of a model
of instrumental conditioning to mobile robot control. In: Paul
S. Schenker and Gerard T. McKee (Eds.) Sensor Fusion and
Decentralized Control in Autonomous Robotic Systems. SPIE
vol. 3209. pp. 55-66.
- D.S. Touretzky and L.M. Saksida (1996) Skinnerbots. In P. Maes, M.
Mataric, J.-A. Meyer, J. Pollack, and S. W. Wilson (eds.), From Animals
to Animats 4: Proceedings of the Fourth International Conference on
Simulation of Adaptive Behavior, pp. 285-294. Cambridge, MA: MIT
- Robots trained to think like
rats-honest. Pittsburgh Post-Gazette, Monday, Dec. 23, 1996, pp. A14-15.
By Byron Spice.
People
- Dave Touretzky, principal investigator
- Lisa Saksida, psychology and modeling
- Nathaniel Daw, neural modeling; "Spine" control program development
- Scott Raymond, vision and manipulation
- David Tolliver, sound generation
- David Gauthier, Spine prototype
- Greg Armstrong, robot maintainer
- Barbara Anderson, costume design and construction
- Dan Wood, audio/LED interface
Greg Armstrong rewards the robot by
pressing a button on the Logitech radio trackball in his right hand.
Acknowledgments
A project of the Computer Science
Department, Robotics Institute, and Center for the Neural Basis of
Cognition at Carnegie Mellon University. This material is based upon
work supported by the National Science Foundation under Grant
No. 9978403. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the author(s)
and do not necessarily reflect the views of the National Science
Foundation. The Amelia robot was provided courtesy of Reid Simmons and the Xavier group.