Bibliography

Arkin, 1998
Arkin, R. C. (1998).
Behavior-Based Robotics.
Intelligent Robotics and Autonomous Agents. The MIT Press.

Bellman, 1957
Bellman, R. E. (1957).
Dynamic Programming.
Princeton University Press, Princeton.

Boutilier et al., 1999
Boutilier, C., Dean, T., and Hanks, S. (1999).
Decision-theoretic planning: Structural assumptions and computational leverage.
Journal of Artificial Intelligence Research, 11:1-94.

Brooks, 1991
Brooks, R. A. (1991).
Intelligence without representation.
Artificial Intelligence, 47:139-159.

Butz, 1999
Butz, M. (1999).
C-XCS: An implementation of the XCS in C.
(http://www.cs.bath.ac.uk/~amb/LCSWEB/computer.htm).

Celaya and Porta, 1996
Celaya, E. and Porta, J. M. (1996).
Control of a six-legged robot walking on abrupt terrain.
In Proceedings of the IEEE International Conference on Robotics and Automation, pages 2731-2736.

Celaya and Porta, 1998
Celaya, E. and Porta, J. M. (1998).
A control structure for the locomotion of a legged robot on difficult terrain.
IEEE Robotics and Automation Magazine, Special Issue on Walking Robots, 5(2):43-51.

Chapman and Kaelbling, 1991
Chapman, D. and Kaelbling, L. P. (1991).
Input generalization in delayed reinforcement learning: An algorithm and performance comparisons.
In Proceedings of the International Joint Conference on Artificial Intelligence, pages 726-731.

Claus and Boutilier, 1998
Claus, C. and Boutilier, C. (1998).
The dynamics of reinforcement learning in cooperative multiagent systems.
In Proceedings of the Fifteenth National Conference on Artificial Intelligence, pages 746-752. American Association for Artificial Intelligence.

Drummond, 2002
Drummond, C. (2002).
Accelerating reinforcement learning by composing solutions of automatically identified subtasks.
Journal of Artificial Intelligence Research, 16:59-104.

Edelman, 1989
Edelman, G. M. (1989).
Neural Darwinism.
Oxford University Press.

Hinton et al., 1986
Hinton, G., McClelland, J., and Rumelhart, D. (1986).
Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations, chapter Distributed Representations.
MIT Press, Cambridge, MA.

Ilg et al., 1997
Ilg, W., Mühlfriedel, T., and Berns, K. (1997).
Hybrid learning architecture based on neural networks for adaptive control of a walking machine.
In Proceedings of the 1997 IEEE International Conference on Robotics and Automation, pages 2626-2631.

Kaelbling, 1993
Kaelbling, L. P. (1993).
Learning in Embedded Systems.
A Bradford Book. The MIT Press, Cambridge, MA.

Kaelbling et al., 1996
Kaelbling, L. P., Littman, M. L., and Moore, A. W. (1996).
Reinforcement learning: A survey.
Journal of Artificial Intelligence Research, 4:237-285.

Kanerva, 1988
Kanerva, P. (1988).
Sparse Distributed Memory.
MIT Press, Cambridge, MA.

Kirchner, 1998
Kirchner, F. (1998).
Q-learning of complex behaviors on a six-legged walking machine.
Robotics and Autonomous Systems, 25:253-262.

Kodjabachian and Meyer, 1998
Kodjabachian, J. and Meyer, J. A. (1998).
Evolution and development of modular control architectures for 1-D locomotion in six-legged animats.
Connection Science, 10:211-237.

Maes and Brooks, 1990
Maes, P. and Brooks, R. A. (1990).
Learning to coordinate behaviors.
In Proceedings of the Eighth National Conference on Artificial Intelligence (AAAI-90), pages 796-802.

Mahadevan and Connell, 1992
Mahadevan, S. and Connell, J. H. (1992).
Automatic programming of behavior-based robots using reinforcement learning.
Artificial Intelligence, 55:311-363.

McCallum, 1995
McCallum, A. K. (1995).
Reinforcement Learning with Selective Perception and Hidden State.
PhD thesis, Department of Computer Science, University of Rochester.

Parker, 2000
Parker, G. B. (2000).
Co-evolving model parameters for anytime learning in evolutionary robotics.
Robotics and Autonomous Systems, 33:13-30.

Pendrith and Ryan, 1996
Pendrith, M. D. and Ryan, M. R. K. (1996).
C-trace: A new algorithm for reinforcement learning of robotic control.
In Proceedings of the 1996 International Workshop on Learning for Autonomous Robots (Robotlearn-96).

Poggio and Girosi, 1990
Poggio, T. and Girosi, F. (1990).
Regularization algorithms for learning that are equivalent to multilayer networks.
Science, 247:978-982.

Schmidhuber, 2002
Schmidhuber, J. (2002).
The speed prior: A new simplicity measure yielding near-optimal computable predictions.
In Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002), Lecture Notes in Artificial Intelligence, pages 216-228. Springer.

Sen, 1994
Sen, S. (1994).
Learning to coordinate without sharing information.
In Proceedings of the Twelfth National Conference on Artificial Intelligence, pages 426-431. American Association for Artificial Intelligence.

Sutton, 1996
Sutton, R. (1996).
Generalization in reinforcement learning: Successful examples using sparse coarse coding.
In Advances in Neural Information Processing Systems 8, pages 1038-1044.

Sutton et al., 1999
Sutton, R., Precup, D., and Singh, S. (1999).
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning.
Artificial Intelligence, 112:181-211.

Sutton, 1991
Sutton, R. S. (1991).
Reinforcement learning architectures for animats.
In Meyer, J. A. and Wilson, S. W., editors, Proceedings of the First International Conference on Simulation of Adaptive Behavior. From Animals to Animats, pages 288-296. The MIT Press, Bradford Books.

Sutton and Barto, 1998
Sutton, R. S. and Barto, A. G. (1998).
Reinforcement Learning: An Introduction.
A Bradford Book. The MIT Press.

Sutton and Whitehead, 1993
Sutton, R. S. and Whitehead, S. D. (1993).
Online learning with random representations.
In Proceedings of the Tenth International Conference on Machine Learning, pages 314-321. Morgan Kaufmann, San Francisco, CA.

Tan, 1997
Tan, M. (1997).
Multi-agent reinforcement learning: Independent vs. cooperative agents.
In Readings in Agents, pages 487-494. Morgan Kaufmann Publishers Inc.

Vallejo and Ramos, 2000
Vallejo, E. E. and Ramos, F. (2000).
A distributed genetic programming architecture for the evolution of robust insect locomotion controllers.
In Meyer, J. A., Berthoz, A., Floreano, D., Roitblat, H. L., and Wilson, S. W., editors, Supplement Proceedings of the Sixth International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 235-244. The International Society for Adaptive Behavior.

Venturini, 1994
Venturini, G. (1994).
Apprentissage Adaptatif et Apprentissage Supervisé par Algorithme Génétique.
PhD thesis, Université de Paris-Sud, Orsay, France.

Watkins and Dayan, 1992
Watkins, C. J. C. H. and Dayan, P. (1992).
Q-learning.
Machine Learning, 8:279-292.

Widrow and Hoff, 1960
Widrow, B. and Hoff, M. (1960).
Adaptive switching circuits.
In 1960 IRE WESCON Convention Record, Part 4, pages 96-104. Institute of Radio Engineers (now IEEE).

Wilson, 1995
Wilson, S. W. (1995).
Classifier fitness based on accuracy.
Evolutionary Computation, 3:149-175.

Wilson, 1996
Wilson, S. W. (1996).
Explore/exploit strategies in autonomy.
In From Animals to Animats 4: Proceedings of the 4th International Conference on Simulation of Adaptive Behavior, pages 325-332.



Josep M Porta 2005-02-17