CMUnited-98 is the RoboCup-98 Simulator League World Champion Team!
Communication | SPAR | PLOS | TPOT-RL | Software
The CMUnited-98 simulator team uses the following novel multi-agent techniques to achieve adaptive coordination:
(1) Hierarchical machine learning (Layered learning)
(2) Flexible, adaptive formations (Locker-room agreement)
(3) Single-channel, low-bandwidth communication
(4) Predictive, locally optimal skills (PLOS)
(5) Strategic positioning using attraction and repulsion (SPAR)
Layered learning is a hierarchical machine learning technique in which lower-level machine learning modules create the action or input spaces of higher-level learning modules. Developed by Peter Stone as a part of his thesis research, layered learning enables successful generalization in high-dimensional, complex spaces.
Layered learning does not automate the process of choosing the hierarchical learning layers or methods. However, with appropriate task decompositions and learning methods, powerful generalization is possible. For example, in the robotic soccer domain, we have linked three learned layers: an individual skill (ball interception), a multi-agent skill (pass evaluation), and a team behavior (pass selection).
Applied Artificial Intelligence (AAI), Volume 12, 1998.
We characterize robotic soccer as an instance of a class of domains called Periodic Team Synchronization (PTS) domains. In this class of domains, a team of agents has periodic opportunities to communicate fully in a safe, off-line situation (i.e. in the "locker-room"). However, in general the agents must act autonomously in real-time with little or no communication possible.
To deal with the challenges of PTS domains, we introduce the concept of a Locker-Room Agreement by which agents determine ahead of time their communication language, their sensory triggers for changes in team strategy, and some multi-agent plans for predictable situations.
In CMUnited-98, the locker-room agreement includes a flexible team structure that allows homogeneous agents to switch roles (positions such as defender or attacker) within a single formation. It also allows the entire team to switch formations (for instance from a defensive to an offensive formation) based on agreed-upon sensory triggers. For example, CMUnited-98 began all of its games in a 4-3-3 formation (4 defenders, 3 midfielders, 3 forwards). However, if they had ever found themselves losing near the end of the game, they would have smoothly switched to a formation with fewer defenders and more forwards. In the actual competition, they often switched to a defensive formation with additional defenders and fewer forwards once they were safely in the lead.
The locker-room agreement also facilitates set-plays, or precompiled multi-agent plans for frequent situations such as kick-offs, goal-kicks, and corner-kicks. While many teams had trouble clearing the ball from the defensive zone after the goalie caught the ball, CMUnited-98 successfully used a sequence of passes to clear the ball first to the side of the field and then up the sideline.
Finally, the locker-room agreement defines the agent communication protocol as described below.
The locker-room agreement was also a key factor in the success of the world champion CMUnited-97 small robot team. It is described most completely in the following publication:
In Artificial Intelligence (AIJ), 1999.
Single-channel, low-bandwidth communication
The Soccer Server's communication model is well-suited to studying single-channel, low-bandwidth communication environments. All agents must broadcast their messages on a single channel so that nearby agents on both teams can hear; there is a limited range of communication; and there is a limited hearing capacity, so message transmission is unreliable.
Challenges to overcome include robustness to lost messages; active interference by opponents; messages requiring multiple simultaneous responses by several teammates; and message targeting. Our approach was successfully implemented and used by CMUnited-98 agents to share state information and to coordinate formation changes via the locker-room agreement.
For details, see:
In Artificial Intelligence (AIJ), 1999.
In both the CMUnited-97 simulator and small robot teams, agents were organized in a team structure based on flexible roles and formations, a novel core team architecture for multi-agent systems that proved effective. In the 1997 teams, an agent's positioning within its role depended only on the ball's location.
One of the most significant improvements in the CMUnited-98 small robot and simulator teams was an algorithm for sophisticated reasoning about the positioning of an agent when it does not have the ball, in anticipation of a pass from the teammate in control of the ball. The agent's positioning is based upon teammate and adversary locations, as well as the locations of the ball and the attacking goal. Jointly with the small robot team, we developed a new algorithm, SPAR (Strategic Positioning using Attraction and Repulsion), that determines the optimal positioning as the solution to a linear-programming-based optimization problem with a multiple-objective function subject to several constraints.
The objectives include maximizing the distance to all opponents and teammates (repulsion) and minimizing the distance to the opponent goal and to the current position of the ball (attraction). The constraints reflect the particular setup of the games; the CMUnited-98 simulator team adds several such constraints to SPAR.
For further details, see:
Submitted to Third International Conference on Autonomous Agents (Agents'99)
Another significant improvement of CMUnited-98 over the CMUnited-97 simulator team is the addition of PLOS: Predictive, Locally Optimal Skills. Locally optimal both in time and in space, PLOS was used to create sophisticated low-level behaviors, including dribbling, kicking, and ball interception.
In "RoboCup-98: Robot Soccer World Cup II", M. Asada and H. Kitano (eds.), 1999. Springer Verlag, Berlin.
We used our CMUnited-97 software to devise and study a new multi-agent reinforcement learning algorithm called Team-Partitioned, Opaque-Transition Reinforcement Learning, or TPOT-RL. TPOT-RL introduces the use of action-dependent features to generalize the state space. In our work, we use a learned action-dependent feature space to aid higher-level reinforcement learning. TPOT-RL is an effective technique to allow a team of agents to learn to cooperate towards the achievement of a specific goal. It is an adaptation of traditional RL methods that is applicable in complex, non-Markovian, multi-agent domains with large state spaces and limited training opportunities.
In our experiments, we used TPOT-RL to train the passing and shooting patterns of a team of agents in fixed positions with no dribbling capabilities. We were able to achieve better results through learning than when using a fixed heuristic policy.
However, since CMUnited-98 used more flexible positioning and introduced a dribbling behavior, the results did not apply directly. We are currently looking into applying TPOT-RL in other domains, such as network routing.
For details, see:
In Conference on Automated Learning and Discovery (CONALD)
You can now download a portion of the CMUnited-98 source code. It is written in C++ and has been tested under Linux and under SunOS.
We made some effort to package the code in such a way that people will be able to learn from and incorporate it. The README file included in the directory describes how to compile and run the code. You should be able to easily produce dribbling, kicking, and ball interception behaviors.
Included with the source code release is a paper describing our control algorithms in detail:
The purpose of the code release is to allow people to get quickly past the low-level implementation details involved in working with the soccer server. Our high-level behaviors, including machine learning modules, a teamwork construct, and a communication paradigm, are best described in our various papers (see above).
Please note that the code is released as is, with no support provided. Also, please keep track of what code and ideas of ours you use.
You may also run CMUnited-98 as it ran in Paris.
Simulator Team Homepage | RoboSoccer Project Homepage | Computer Science Department | School of Computer Science