Peter's Thesis Page

My dissertation is available as CMU Computer Science Tech Report CMU-CS-98-187:

Layered Learning in Multi-Agent Systems

Thesis Committee
Manuela M. Veloso, Chair
Andrew W. Moore
Herbert A. Simon
Victor R. Lesser (University of Massachusetts, Amherst)

An updated version is available as a book from MIT Press:

Layered Learning in Multiagent Systems:
A Winning Approach to Robotic Soccer

by Peter Stone

MIT Press, 2000.
ISBN: 0262194384

Abstract

Multi-agent systems in complex, real-time domains require agents to act effectively both autonomously and as part of a team. This dissertation addresses multi-agent systems consisting of teams of autonomous agents acting in real-time, noisy, collaborative, and adversarial environments. Because of the inherent complexity of this type of multi-agent system, this thesis investigates the use of machine learning within multi-agent systems. The dissertation makes four main contributions to the fields of Machine Learning and Multi-Agent Systems.

First, the thesis defines a team member agent architecture within which a flexible team structure is presented, allowing agents to decompose the task space into flexible roles and to switch roles smoothly while acting. Team organization is achieved by the introduction of a locker-room agreement: a collection of conventions followed by all team members. The locker-room agreement defines agent roles, team formations, and pre-compiled multi-agent plans. In addition, the team member agent architecture includes a communication paradigm for domains with single-channel, low-bandwidth, unreliable communication. The communication paradigm facilitates team coordination while being robust to lost messages and active interference from opponents.

Second, the thesis introduces layered learning, a general-purpose machine learning paradigm for complex domains in which learning a mapping directly from agents' sensors to their actuators is intractable. Given a hierarchical task decomposition, layered learning allows for learning at each level of the hierarchy, with learning at each level directly affecting learning at the next higher level.
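As a rough illustration of the idea (not the thesis implementation), the sketch below trains one layer and then feeds its output into the training of the next layer. The learners are deliberately trivial stand-ins (a threshold rule and a frequency table), and all function names, feature choices, and data values are hypothetical.

```python
from collections import defaultdict

def learn_threshold(samples):
    """Layer 1 (stand-in learner): fit a 1-D threshold from (distance, success) pairs.
    Assumes the successes and failures are separable by distance."""
    succ = [d for d, ok in samples if ok]
    fail = [d for d, ok in samples if not ok]
    threshold = (max(succ) + min(fail)) / 2.0
    return lambda distance: distance <= threshold

def learn_table(samples, featurize):
    """Layer 2 (stand-in learner): count outcomes per feature tuple, predict by majority."""
    counts = defaultdict(lambda: [0, 0])  # feature tuple -> [successes, trials]
    for raw, ok in samples:
        entry = counts[featurize(raw)]
        entry[0] += int(ok)
        entry[1] += 1

    def predict(raw):
        successes, trials = counts[featurize(raw)]
        return trials > 0 and 2 * successes >= trials

    return predict

# Layer 1: a low-level skill predictor (made-up training data).
layer1_data = [(1.0, True), (2.0, True), (3.0, True), (6.0, False), (8.0, False)]
can_intercept = learn_threshold(layer1_data)

# Layer 2: its input features are produced by the already-learned layer 1,
# so what layer 1 learned directly shapes what layer 2 can learn.
def pass_features(raw):
    receiver_dist, opponent_dist = raw
    return (can_intercept(receiver_dist), can_intercept(opponent_dist))

layer2_data = [((2.0, 7.0), True), ((1.0, 9.0), True), ((2.0, 2.0), False),
               ((7.0, 1.0), False), ((3.0, 8.0), True)]
pass_ok = learn_table(layer2_data, pass_features)
```

The point of the sketch is only the wiring: the second learner never sees raw distances, only the outputs of the first.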

Third, the thesis introduces a new multi-agent reinforcement learning algorithm, namely team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL is designed for domains in which agents cannot necessarily observe the state changes when other team members act. It exploits local, action-dependent features to aggressively generalize its input representation for learning and partitions the task among the agents, allowing them to simultaneously learn collaborative policies by observing the long-term effects of their actions.
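The following is a heavily simplified sketch of the ideas named in this paragraph, based only on the description above and not on the algorithm as specified in the thesis. It assumes each agent keeps its own small value table (the partitioning), keys it by a coarse feature attached to each candidate action, and moves its estimate toward a reward observed some time after acting, without ever looking at the next state. The class name, the update rule, the greedy choice, and all values are assumptions made for illustration.

```python
from collections import defaultdict

class PartitionedLearner:
    """One agent's learner: a value table over (feature, action) pairs."""

    def __init__(self, actions, alpha=0.5):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate (assumed value)
        self.q = defaultdict(float)   # (feature, action) -> estimated value

    def choose(self, features):
        """Pick the action whose (feature, action) entry has the highest value.
        `features` maps each action to a coarse, local, action-dependent feature."""
        return max(self.actions, key=lambda a: self.q[(features[a], a)])

    def update(self, feature, action, observed_reward):
        """Move the estimate toward a reward observed later.
        There is no bootstrapping from a successor state, because the
        transition caused by teammates' actions is assumed unobservable."""
        key = (feature, action)
        self.q[key] += self.alpha * (observed_reward - self.q[key])

ACTIONS = ("pass_left", "pass_right")

# Each agent learns only for its own part of the task (hypothetical regions).
team = {region: PartitionedLearner(ACTIONS) for region in ("left", "right")}

left = team["left"]
left.update("open", "pass_left", 1.0)
left.update("open", "pass_left", 1.0)
left.update("blocked", "pass_right", -1.0)
choice = left.choose({"pass_left": "open", "pass_right": "blocked"})
```

Because every agent owns a separate table, all agents can update at the same time without sharing state.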

Fourth, the thesis contributes a fully functioning multi-agent system that incorporates learning in a real-time, noisy domain with teammates and adversaries. Detailed algorithmic descriptions of the agents' behaviors as well as their source code are included in the thesis.

Empirical results validate all four contributions within the simulated robotic soccer domain. The generality of the contributions is verified by applying them to the real robotic soccer and network routing domains. Ultimately, this dissertation demonstrates that by learning portions of their cognitive processes, selectively communicating, and coordinating their behaviors via common knowledge, a group of independent agents can work towards a common goal in a complex, real-time, noisy, collaborative, and adversarial environment.

Thesis

All files below are gzipped PostScript and formatted for double-sided printing.
Uncompressed PostScript and PDF versions are available from the CS tech reports page.

The complete dissertation in one file:

thesis.ps.gz (253 pages)
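If no gunzip tool is at hand, a downloaded file can be decompressed with a few lines of Python. This is a minimal sketch using only the standard library; the helper name is hypothetical, and the file names in the usage comment assume the file was saved to the current directory.

```python
import gzip
import shutil

def gunzip_file(src_path, dst_path):
    """Decompress a .gz file to dst_path, streaming so large files fit in memory."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path

# Example usage, assuming thesis.ps.gz has already been downloaded:
# gunzip_file("thesis.ps.gz", "thesis.ps")
```

The resulting .ps file can then be opened with any PostScript viewer or sent to a PostScript printer.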

Each section in its own file:

Abstract and Contents (pp. 1-18)

Chapter 1: Introduction (pp. 19-24)
    1.1  Motivation
    1.2  Objectives and Approach
    1.3  Contributions
    1.4  Reader's Guide to the Thesis

Chapter 2: Substrate Systems (pp. 25-52)
    2.1  Overview
    2.2  The RoboCup Soccer Server
    2.3  The CMUnited-97 Real Robots
    2.4  Network Routing

Chapter 3: Team Member Agent Architecture (pp. 53-90)
    3.1  Periodic Team Synchronization (PTS) Domains
    3.2  Architecture Overview
    3.3  Teamwork Structure
    3.4  Communication Paradigm
    3.5  Implementation in Robotic Soccer
    3.6  Results
    3.7  Transfer to Real Robots
    3.8  Discussion and Related Work

Chapter 4: Layered Learning (pp. 91-104)
    4.1  Principles
    4.2  Instantiation in Simulated Robotic Soccer
    4.3  Discussion
    4.4  Related Work

Chapter 5: Learning an Individual Skill (pp. 105-114)
    5.1  Ball Interception in the Soccer Server
    5.2  Training
    5.3  Results
    5.4  Discussion
    5.5  Related Work

Chapter 6: Learning a Multi-Agent Behavior (pp. 115-134)
    6.1  Decision Tree Learning for Pass Evaluation
    6.2  Using the Learned Behaviors
    6.3  Scaling up to Full Games
    6.4  Discussion
    6.5  Related Work

Chapter 7: Learning a Team Behavior (pp. 135-168)
    7.1  Motivation
    7.2  TPOT-RL
    7.3  TPOT-RL Applied to Simulated Robotic Soccer
    7.4  TPOT-RL Applied to Network Routing
    7.5  Discussion
    7.6  Related Work

Chapter 8: Competition Results (pp. 169-180)
    8.1  Pre-RoboCup-96
    8.2  MiroSot-96
    8.3  RoboCup-97
    8.4  RoboCup-98
    8.5  Lessons Learned from Competitions

Chapter 9: Related Work (pp. 181-208)
    9.1  MAS from an ML Perspective
    9.2  Robotic Soccer

Chapter 10: Conclusion (pp. 209-214)
    10.1  Contributions
    10.2  Future Directions
    10.3  Concluding Remarks

Appendices (pp. 215-234)
    A  List of Acronyms
    B  Robotic Soccer Agent Skills
        B.1  CMUnited-98 Simulator Agent Skills
        B.2  CMUnited-97 Small-Robot Skills
    C  CMUnited-98 Simulator Team Behavior Modes
        C.1  Conditions
        C.2  Effects
    D  CMUnited Simulator Team Source Code

Bibliography (pp. 235-253)

On-line Appendix

This is the on-line appendix of my dissertation (Appendix D).
It contains source code and executables of the CMUnited-97 and CMUnited-98 simulator teams.
Details regarding the contents of the files are available on the respective team pages (CMUnited-97 and CMUnited-98).

  • CMUnited-97 source code.
  • CMUnited-97 executable for SunOS.
  • CMUnited-98 source code.
  • CMUnited-98 executable for Linux and SunOS.

  • Peter Stone