MARS Final Technical Report

June, 2004
Principal Investigator
Prof. Manuela Veloso (mmv@cs.cmu.edu)
Address
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA, 15213
ph: 412 268 6021
fax: 412 268 4801
Project URL
Primary web site:
http://www.cs.cmu.edu/~coral/MARS
Research web site: http://www.cs.cmu.edu/~coral
Robot Soccer web site
http://www.cs.cmu.edu/~robosoccer

Objective

Our DARPA MARS research effort has focused on the problem of robot autonomy in multi-robot team tasks set in highly dynamic, adversarial environments against unknown opponents. The research effort has been a broad initiative, covering a range of domains from simulated robot agents through to fully individually autonomous robots with complex kinematic structure. Our primary focus has been to develop complete, effective, and adaptive software for autonomous robot teams, enabling a team to perform its complex real-time task while adapting its strategy, behavior, and control methods to the opponent team and environment. Moreover, our goal has been to develop techniques and algorithms that work, meaning that all of our approaches must be extensively validated in real-world tasks. Our evaluations have focused on testing the performance of our techniques within the domain of robot soccer, where teams of robots compete autonomously in games of soccer. Indeed, over the course of our research project, we have competed with our various teams in over 7 international and regional competitions, equating to over one hundred games in different domains against teams using unknown strategies.

Approach

Our approach is characterized by a consistent focus on complete autonomous multi-robot teams operating in adversarial domains. Complete means that each robot must continuously perceive, think, and act within the team environment in real-time, where real-time means at least as fast as the opponents. This means each component of the robot's decision cycle must perform to expectations, as well as integrating into a seamless whole in order to achieve good team performance. The teams are fully autonomous, meaning that there is no human involved in the decision making cycle during task execution.

We divide our work into two main components: developing individually skilled team members, and developing robots that are able to work as a team. In each case, our work includes significant efforts to apply learning and adaptation throughout the control hierarchy in a practical and useful way. We first describe the domains we work with. All of our robot domains are based upon robot soccer, where teams of autonomous robots compete in a game of soccer. Under DARPA MARS funding we have investigated a range of domains including:

Domains

[Sony AIBO Legged League] The legged league consists of teams of Sony AIBOs, where each AIBO is autonomous and can communicate via 802.11b wireless ethernet. Each robot is a quadruped with 16 degrees of freedom, with a color camera as its primary perception mechanism. Each robot also has accelerometers to act as a G-sensor. The robots are unmodified, except for the software, which is written in C++ using the SDK made available by Sony to provide the low-level API for controlling the robot. The field is approximately 4m long and 3m wide and is surrounded by four color coded markers and two colored goals.

[Small-Size Robot League] Each team consists of five small robots, where the robots are custom built and must fit within an 18cm diameter cylinder that is at most 15cm tall. Robots use ball manipulation devices including a kicker, and a special purpose dribbling device to control the ball. Perception is provided by overhead cameras and control by an off-field computer. Small-size robots are therefore autonomous as a team. The ball is an orange golf ball, and the field is 4.9m long by 3.3m wide with no walls.

[Segway Soccer League] This is a new domain we have developed under the MARS program. It makes use of both humans and robots, where all platforms are based on the Segway mobility platforms. We have proposed a new league to entice other researchers to participate in this domain. The ball is a standard size 5 soccer ball, and the field scales with the number of players so that 11 vs 11 gives a full size soccer field.

[Simulation Robot and Coach League] Simulation soccer runs in a 2D or 3D simulator where each simulated robot is given high level perceptual information at a predefined frame rate. The game uses dimensions comparable to humans playing on a soccer field. We also have a special coach competition whereby a coachable team is provided, and it is up to the coach to learn how to best provide advice to the team based on prior observation of the team and opponents.

Technical Approach

All multi-robot systems need team members with sufficient individual skills to allow the team to achieve its goals. We have focused our efforts on developing individually skilled robots that are able to participate in a team and adapt to their environments. Individual skills can be further broken into perception, cognition, and action, where learning can be applied to each component.

Perception

All of our robots utilize color vision as their primary sensing modality. In addition, depending upon the platform in question, the robots may make use of odometry information, joint angles, and accelerometer (G-sensor) data. As with any sensor, information is gleaned from the world at a steady, pre-defined rate, but is corrupted by systematic bias and by noise of both Gaussian and non-Gaussian forms, and suffers from unavoidable latency. The purpose of robot perception is to extract information from these noisy sensors and to fuse information from different sensors (and perhaps from other robots) into a useful model of the current state of the world, so that the control hierarchy can make intelligent, informed decisions and act upon them. Thus, perception consists of raw sensory processing, followed by tracking and information fusion to obtain a model of the world state.

[Vision Based Perception] In prior work under the MARS program, we developed a fast color vision library called CMVision [Bruce et al. 00]. CMVision is now widely used within robotics and related communities, and is also used as a benchmark for comparing speeds of new color vision techniques. CMVision operates in a three step process: first, each pixel is classified into a color class using a lookup table for speed; second, neighboring pixels are collected to form contiguous regions using run length encoding and connected component analysis; finally, the regions themselves are analyzed using shape and geometric constraints to identify a priori known obstacles.
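
As an illustration only, the three-step process described above can be sketched in Python. This is a minimal sketch and not the CMVision implementation: the function names, the dictionary-based lookup table, and the toy character image are all assumptions made for the example, and the quadratic run-merging loop stands in for whatever faster scan a real-time library would use.

```python
from collections import defaultdict

def classify(image, lut):
    """Step 1: map every pixel value to a color class via a lookup table (0 = none)."""
    return [[lut.get(px, 0) for px in row] for row in image]

def run_length_encode(classes):
    """Step 2a: compress each row into (row, start, end, color) runs, skipping class 0."""
    runs = []
    for y, row in enumerate(classes):
        x = 0
        while x < len(row):
            start = x
            while x < len(row) and row[x] == row[start]:
                x += 1
            if row[start] != 0:
                runs.append((y, start, x - 1, row[start]))
    return runs

def connect_runs(runs):
    """Step 2b: merge same-color runs that overlap on adjacent rows (union-find)."""
    parent = list(range(len(runs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Simple O(n^2) pairing, kept for clarity rather than speed.
    for i, (y1, s1, e1, c1) in enumerate(runs):
        for j, (y2, s2, e2, c2) in enumerate(runs):
            if y2 == y1 + 1 and c1 == c2 and s1 <= e2 and s2 <= e1:
                parent[find(i)] = find(j)

    regions = defaultdict(list)
    for i, run in enumerate(runs):
        regions[find(i)].append(run)
    return list(regions.values())

def describe(region):
    """Step 3 input: area and bounding box, which geometric tests can then filter."""
    area = sum(e - s + 1 for _, s, e, _ in region)
    ys = [y for y, _, _, _ in region]
    xs = [s for _, s, _, _ in region] + [e for _, _, e, _ in region]
    return {"color": region[0][3], "area": area,
            "bbox": (min(xs), min(ys), max(xs), max(ys))}
```

A caller would chain the steps as describe(r) for each r in connect_runs(run_length_encode(classify(image, lut))) and then keep only the regions whose area and bounding box match an expected object.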

We have used CMVision in a variety of domains including the Sony AIBOs, which have on-board vision with limited CPU resources, and the small-size league, which makes use of an overhead camera. In our recent work, we have extended CMVision to include unified multi-camera detection of a robot team from an overhead perspective. In particular, we have investigated techniques for handling control in the regions of overlap between different camera perspectives. This system was validated at RoboCup 2004 under ambient lighting conditions.

We have investigated alternative color vision based techniques with our Segway platforms. We have developed general region growing techniques based on a homogeneity criterion: fixed color distance from the initial threshold. Following our prior work, color classification uses a lookup table on the average region color, and high level object detection uses geometric constraints. We are investigating techniques to enable the object detection to operate independent of illumination variation [Browning et al 04, and additional publications forthcoming].
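
The region growing idea can be illustrated with a short Python sketch. It rests on assumptions that the text does not confirm: the homogeneity test is read here as a fixed Euclidean distance from the color of the seed pixel, pixels are plain RGB tuples, and the names grow_region and mean_color are hypothetical.

```python
from collections import deque

def color_distance(a, b):
    """Euclidean distance between two color tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def grow_region(image, seed, max_dist):
    """Collect 4-connected pixels whose color lies within max_dist of the seed color."""
    height, width = len(image), len(image[0])
    seed_color = image[seed[0]][seed[1]]
    region, frontier = {seed}, deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if 0 <= ny < height and 0 <= nx < width and (ny, nx) not in region:
                if color_distance(image[ny][nx], seed_color) <= max_dist:
                    region.add((ny, nx))
                    frontier.append((ny, nx))
    return region

def mean_color(image, region):
    """Average region color, the value that would index a classification table."""
    n = len(region)
    return tuple(sum(image[y][x][c] for y, x in region) / n for c in range(3))
```

Classifying the single average color of a region, rather than every pixel, is what allows the lookup table from the earlier work to be reused unchanged.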

[Multi-Object Tracking] We have developed robust Kalman filter-based multi-object tracking techniques. In earlier work under the MARS program we developed improbability filtering methods for reducing the rate of false-positives and their resulting detrimental effect on tracking [Browning et al. 02]. Recently, we have developed Kalman-filter based techniques for object tracking for robots with on-board perception for both the Sony AIBOs and Segway platforms [publication in preparation].
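
To make the combination of filtering and false-positive rejection concrete, here is a minimal scalar sketch in Python. It is not the improbability filter of [Browning et al. 02]: a standard innovation gate is swapped in as a stand-in, the motion model is a simple random walk, and the class name and parameters are assumptions for the example.

```python
class GatedKalman1D:
    """Scalar Kalman filter for a slowly drifting quantity, with outlier gating."""

    def __init__(self, x0, p0, process_var, meas_var, gate_sigma=3.0):
        self.x = x0                # state estimate
        self.p = p0                # estimate variance
        self.q = process_var       # process noise variance
        self.r = meas_var          # measurement noise variance
        self.gate_sigma = gate_sigma

    def predict(self):
        # Random-walk model: the estimate stays put, the uncertainty grows.
        self.p += self.q

    def update(self, z):
        """Fuse measurement z; return False if it was rejected as improbable."""
        innovation = z - self.x
        s = self.p + self.r        # innovation variance
        if innovation ** 2 > (self.gate_sigma ** 2) * s:
            return False           # too unlikely under the current belief
        k = self.p / s             # Kalman gain
        self.x += k * innovation
        self.p *= 1.0 - k
        return True
```

Rejected measurements leave the state untouched, so an isolated false positive from vision does not drag the track away from the object.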

[Localization] In prior work we have developed robust extensions to Monte-Carlo Localization, called Sensor Resetting Localization (SRL) [Lenser & Veloso, 00]. We have recently revised and extended this approach to use a particle filter with a hybrid MCMC update technique and Metropolis sampling, derived from the physics and particle filter literature. This technique has been fully implemented on the Sony AIBO platforms and makes use of a wide range of visual features including field markers, goals, and field edges.
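
The sensor resetting idea can be sketched on a one-dimensional toy problem. The Python below is an assumption-laden illustration rather than the AIBO implementation: the robot lives on a line, measures its distance to one landmark, and the rule for how many particles to reset (proportional to how far the average likelihood falls below a threshold) is our reading of the approach. The MCMC and Metropolis extensions are not shown.

```python
import math
import random

def likelihood(particle, z, landmark, sigma):
    """Gaussian-shaped likelihood of range reading z from a particle position."""
    expected = landmark - particle
    return math.exp(-0.5 * ((z - expected) / sigma) ** 2)

def srl_update(particles, z, landmark, sigma, threshold, rng):
    """One measurement update with sensor resetting; returns (particles, n_reset)."""
    n = len(particles)
    weights = [likelihood(p, z, landmark, sigma) for p in particles]
    avg = sum(weights) / n

    # Ordinary importance resampling (skipped if every weight underflowed to 0).
    if sum(weights) > 0.0:
        resampled = rng.choices(particles, weights=weights, k=n)
    else:
        resampled = list(particles)

    # Sensor resetting: the worse the particles explain z, the more of them
    # are replaced by poses drawn directly from the sensor reading.
    n_reset = int(n * max(0.0, 1.0 - avg / threshold))
    for i in range(n_reset):
        resampled[i] = rng.gauss(landmark - z, sigma)
    return resampled, n_reset
```

When the belief already agrees with the reading, no particles are reset and the update reduces to a plain particle filter step; after a large unmodeled displacement almost all particles are redrawn from the sensor.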

[Local Model] We have developed a local obstacle model for navigation derived from region edges and camera projective geometry. First, the model enables reactive navigation by providing a local map of the surrounding terrain that is updated in real-time in synchrony with the robot's perception. Second, it enables action selection to be better informed by providing context on the surrounding environment to the decision making process. We have validated this technique in our Sony AIBO team at RoboCup events, as well as in controlled laboratory experiments [Fasola et al, 04, Lenser et al 03].

Single Robot Cognition and Action

[Skills, Tactics] We have developed a hierarchical control architecture called STP: Skills, Tactics, and Plays [Browning et al, 03, forthcoming publication]. Plays deal with team behavior, and are discussed below. Tactics and skills control single robot behavior. Each tactic specifies an augmented finite state machine (AFSM) of skills and an evaluation function. The AFSM defines what skills may be executed, where transitions between skills in the AFSM are a function of the state of the world. The evaluation function determines what needs to be done to achieve the goal of the tactic and specifies the parameters that are passed to the executing skill. Each skill is a focused control policy, where focused means that it is only defined over a portion of the state space. This restriction means that when creating a skill, the designer need only worry about the behavior of the control policy over the relevant states rather than in all possible circumstances. The hierarchical decomposition that STP provides has two benefits. First, it facilitates easy development of new behaviors, or modification of pre-existing behaviors. Second, it provides a natural mechanism to seamlessly integrate learned skills with hand-coded ones. We discuss this more below.
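
The relationship between a tactic, its state machine of skills, and its evaluation function can be sketched as follows. This Python is a hypothetical illustration of the structure described above, not the STP code: the Tactic class, the two toy skills, and the world dictionary keys are all invented for the example.

```python
class Tactic:
    """A tactic: a state machine over skills plus an evaluation function."""

    def __init__(self, skills, transitions, evaluate, start):
        self.skills = skills            # skill name -> function(world, params) -> command
        self.transitions = transitions  # skill name -> [(condition(world), next skill)]
        self.evaluate = evaluate        # world -> parameters for the active skill
        self.current = start

    def step(self, world):
        # Transitions depend only on the state of the world.
        for condition, target in self.transitions.get(self.current, []):
            if condition(world):
                self.current = target
                break
        return self.skills[self.current](world, self.evaluate(world))


def go_to_ball(world, params):
    """Focused skill: only meaningful while the ball is out of reach."""
    return ("move", world["ball"])

def kick(world, params):
    """Focused skill: only meaningful once the ball is in reach."""
    return ("kick", params["aim"])

shoot = Tactic(
    skills={"go_to_ball": go_to_ball, "kick": kick},
    transitions={
        "go_to_ball": [(lambda w: w["ball_dist"] < 0.1, "kick")],
        "kick": [(lambda w: w["ball_dist"] >= 0.1, "go_to_ball")],
    },
    evaluate=lambda w: {"aim": w["goal"]},
    start="go_to_ball",
)
```

Because a skill is just a function from world state and parameters to a command, a learned policy can be dropped into the skills table in place of a hand-coded one without touching the tactic.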

[Real-time Path Planning] We have developed a real-time robot path planning technique, based on Rapidly Exploring Random Trees (RRTs) that plans sufficiently quickly to be applied to a 5 robot team which fully replans each robot path every frame at 30Hz [Bruce & Veloso, 03]. In more recent work, we have investigated a similar approach using probabilistic roadmap techniques to provide rapid replanning in a dynamic world while fulfilling an objective function (such as maintaining line of sight to a target or targets). We are investigating approaches to combine both techniques to obtain the benefits of both.
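
For readers unfamiliar with RRTs, a basic planner can be sketched in a few lines of Python. This is a plain textbook-style RRT under stated assumptions, not the real-time planner of [Bruce & Veloso, 03]: it checks only the new node (not the connecting edge) against obstacles, uses a linear nearest-neighbor search, and the parameter names are our own.

```python
import math
import random

def steer(src, dst, step):
    """Move from src toward dst by at most step; return dst itself if in reach."""
    d = math.dist(src, dst)
    if d <= step:
        return dst
    t = step / d
    return (src[0] + t * (dst[0] - src[0]), src[1] + t * (dst[1] - src[1]))

def rrt(start, goal, is_free, bounds, step=0.5, goal_bias=0.1,
        max_iters=2000, rng=None):
    """Grow a tree from start; return a list of points to goal, or None."""
    rng = rng or random.Random()
    parent = {start: None}
    for _ in range(max_iters):
        # With probability goal_bias pull the tree toward the goal.
        if rng.random() < goal_bias:
            target = goal
        else:
            target = (rng.uniform(*bounds[0]), rng.uniform(*bounds[1]))
        nearest = min(parent, key=lambda node: math.dist(node, target))
        new = steer(nearest, target, step)
        if new in parent or not is_free(new):
            continue
        parent[new] = nearest
        if new == goal:
            path, node = [], new
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
    return None
```

Replanning every frame, as described above, amounts to calling such a planner once per robot per vision frame with the latest obstacle positions captured in is_free.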

Individual Learning

[Learning Action Outcomes] We have developed techniques to build probabilistic models of the outcomes of discrete actions (in this case kicks) in order to provide a predictive model to aid in robot decision making. Prior to task execution, each robot collects data on the outcome of each of its different actions. This data is used to build a statistical model of each action. During run-time execution, whenever a choice of different actions occurs, the robot analyzes the predicted outcomes and chooses the action that leads to the result with the highest utility. This work has been fully implemented and tested on the Sony AIBO platform in RoboCup competitions [Chernova & Veloso, 04].
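
The selection step can be illustrated with a small Python sketch. It assumes, purely for the example, that each kick is modeled by the raw list of ball displacements recorded during data collection and that utility is the negative distance from the predicted ball position to the goal; the published models and utility function may differ.

```python
import math

def expected_utility(samples, ball, utility):
    """Average utility over the recorded (dx, dy) outcomes of one kick."""
    total = sum(utility((ball[0] + dx, ball[1] + dy)) for dx, dy in samples)
    return total / len(samples)

def choose_kick(models, ball, utility):
    """Pick the kick whose predicted outcomes have the highest expected utility."""
    return max(models, key=lambda name: expected_utility(models[name], ball, utility))

def closeness_to(goal):
    """Hypothetical utility: the nearer the ball ends up to goal, the better."""
    return lambda position: -math.dist(position, goal)
```

With models such as {"forward": [...], "side": [...]}, the same robot picks a different kick depending on where the ball lies relative to the goal, which is the behavior the predictive model is meant to provide.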

[Learning Sensor States] Being able to robustly learn, classify, and recognize 'state context' from a stream of sensor measurements is one of the key challenges to robot intelligence. If a robot can recognize the state of the world, and select the appropriate controller, it can potentially extend the range of even standard control techniques to handle much more dynamic situations by switching between controllers appropriately. We have explored a number of techniques using non-parametric statistical time-series methods to learn and classify sensor state from a sequence of continuous, multi-dimensional sensor vectors. These techniques have been applied to the Sony AIBO platform under various conditions including variable lighting [Lenser & Veloso, 03], and G-sensor data when walking under different conditions [Vail & Veloso, 04].

[Motion Optimization] We have explored an approach to learning optimal motion parameters for a quadruped robot (a Sony AIBO). Our approach uses genetic algorithms to learn to optimize the parameters controlling the walk motion on a Sony AIBO [Chernova & Veloso, 04]. The resulting walk on an ERS-7 reaches speeds of 350mm/s, which is the fastest walk ever observed on a Sony AIBO. Optimizing walk parameters is a difficult process given the high dimensionality of the state space (approximately 12-24 dimensions depending upon which symmetries are enabled), and the amount of noise in each fitness evaluation.
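
A generic evolutionary loop of the kind described can be sketched as follows. This Python is only a schematic under our own assumptions (parameters initialized in [0, 1], truncation selection, uniform crossover, Gaussian mutation, and averaging over repeated trials to dampen evaluation noise); it is not the algorithm or the parameter encoding of [Chernova & Veloso, 04].

```python
import random

def evolve(fitness, dim, rng, pop_size=20, generations=30, trials=3, sigma=0.1):
    """Return the best parameter vector found by a small elitist GA."""

    def score(individual):
        # Average several evaluations because each one may be noisy.
        return sum(fitness(individual) for _ in range(trials)) / trials

    population = [[rng.uniform(0.0, 1.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        elite = ranked[: pop_size // 4]          # survivors are kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover followed by Gaussian mutation of each gene.
            child = [rng.choice(pair) + rng.gauss(0.0, sigma) for pair in zip(a, b)]
            children.append(child)
        population = elite + children
    return max(population, key=score)
```

On a real robot the fitness call would be a timed walk across the field, so the trials parameter trades evaluation time against robustness to noise.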

[Skill Learning] We have developed new techniques based on function approximation to learn robot skills (in the context of the STP architecture) from a human teacher. The human teacher provides example traces of execution demonstrating the desired skill. The robot is then able to recall, and generalize, from these execution traces to execute a focused control policy. This technique works in conjunction with hand coded policies in a robot control architecture in a seamless manner [Browning, Xu, Veloso 04].

Team Cognition and Learning

[Plays for Coordination] As described above, we have developed a Skills, Tactics, Plays architecture (STP) for teamwork in adversarial environments [Browning et al 03, Bowling et al 04, additional publications forthcoming]. Plays are the component for team interaction. Each play is a fixed team plan that encodes a sequence of synchronized actions for each role in the team. Roles are assigned to players dynamically during execution, thereby allowing flexibility of execution. Plays are selective, in that they have applicability conditions defined on the world state that determine when a play can execute. Plays have termination criteria that decide when a play should stop executing, and also what the result of the play is upon termination. The latter is used to adapt future play selection as described below. We have developed a play language, which allows rapid development of new plays. Indeed, we have shown with our RoboCup experience that a completely new set of plays, i.e. a new team strategy, can be developed within an hour or so. This is significantly faster than any other mechanism currently present in the community.
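
The ingredients of a play listed above (applicability, termination with a result, roles, dynamic assignment, adaptation) can be sketched in Python. Everything here is a hypothetical simplification: the Play fields, the greedy nearest-robot role assignment, the pick-the-heaviest selection rule, and the multiplicative weight update are assumptions for the example and do not reproduce the play language or the selection algorithm of the cited papers.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float]

@dataclass
class Play:
    name: str
    applicable: Callable[[dict], bool]        # may this play start now?
    done: Callable[[dict], Optional[str]]     # None, or a result such as "success"
    roles: List[Tuple[str, Point]]            # (role, target position), by priority
    weight: float = 1.0                       # adapted from past results

def select_play(playbook: List[Play], world: dict) -> Optional[Play]:
    """Among applicable plays, pick the one with the largest weight."""
    candidates = [p for p in playbook if p.applicable(world)]
    return max(candidates, key=lambda p: p.weight) if candidates else None

def assign_roles(play: Play, robots: Dict[int, Point]) -> Dict[str, int]:
    """Greedy assignment: each role, in priority order, takes the closest free robot."""
    free = dict(robots)
    assignment = {}
    for role, target in play.roles:
        if not free:
            break
        robot_id = min(free, key=lambda r: math.dist(free[r], target))
        assignment[role] = robot_id
        del free[robot_id]
    return assignment

def adapt(play: Play, result: str, rate: float = 1.5) -> None:
    """Raise the weight of plays that succeed and lower it for those that fail."""
    play.weight = play.weight * rate if result == "success" else play.weight / rate
```

Re-running assign_roles every frame is what makes role assignment dynamic: if a robot is pushed away from the ball, a closer teammate simply takes over the role.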

[Pickup Teams] We have recently begun extending our STP architecture to the problem of pickup teams. A pickup team is a heterogeneous team that forms with partial, or no a priori, information about each teammate. In such a situation the team needs to rapidly acquire knowledge about each teammate and negotiate what roles will be assigned on the team. We have taken the first step towards a pickup team framework by forming a joint team with the RoboDragons, from Aichi Prefectural University in Japan. Our teams joined, and competed at RoboCup 2004, with little a priori knowledge of each other's frameworks except the agreed upon interface at the vision level and at the robot command level. Many issues were raised by this challenge that we are currently investigating.

[Token based Coordination] We have investigated a market based scheme combined with a token passing mechanism to coordinate role assignment between distributed robots in a highly uncertain world. In particular, we have made use of our work with shared perception to enable the market based system to allow the bidding process to operate independently of the network communications, thus preserving bandwidth on high latency networks. This technique has been empirically validated on our Sony AIBO robots [Vail & Veloso 03].

[Coaching] We have developed a coach framework whereby an agent can observe a team performing a task, and then offer advice to help improve team performance. The coach generates advice for the team by observing the performance of the team, and other teams, in the past from game logs. The coach generates an abstract Markov Decision Process to describe the task faced by the team [Riley & Veloso 04]. By solving this Abstract MDP, the coach then provides useful information, in the form of an abstract policy, that the robots use during execution to make better decisions. We have validated this approach on many logs and simulation games obtaining statistically significant results that demonstrate the usefulness of the coach in improving team performance.
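
Solving a small abstract MDP to obtain a policy can be illustrated with standard value iteration. The Python below is a generic textbook sketch, not the coach's algorithm: the four-state soccer abstraction, its transition probabilities, and the reward values in the usage note are invented for the example, and in the work described above the model would be learned from game logs.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-9):
    """Solve an MDP given transitions[state][action] = [(prob, next_state), ...].

    rewards[state] is the reward for being in a state; states with no
    actions are terminal. Returns (values, policy).
    """
    values = {s: 0.0 for s in transitions}

    def q_value(outcomes):
        return sum(p * values[t] for p, t in outcomes)

    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max((q_value(o) for o in actions.values()), default=0.0)
            new_value = rewards.get(s, 0.0) + gamma * best
            delta = max(delta, abs(new_value - values[s]))
            values[s] = new_value
        if delta < tol:
            break

    policy = {s: max(actions, key=lambda a: q_value(actions[a]))
              for s, actions in transitions.items() if actions}
    return values, policy
```

For instance, with abstract states "midfield", "attack", "goal", and "lost", where passing from midfield reaches "attack" with probability 0.8 and dribbling does so with probability 0.5, the resulting policy advises passing from midfield; advice of this form is what an abstract policy would hand to the players.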

[Shared Perception] In our prior work under the MARS program we have developed probabilistic techniques for fusing perceptual information from multiple robots into a shared world model. Moreover, we have investigated techniques for determining which information, local or shared, a robot should use in order to act using its best available knowledge [Roth, Vail, Veloso, 03].
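
One common way to fuse estimates from several robots is inverse-variance weighting of independent Gaussian estimates, sketched below in Python. We use it only as an illustrative stand-in: the cited work may use a different fusion rule, and the rule in best_available for choosing between local and shared information (take the estimate with the smaller variance) is a hypothetical simplification.

```python
def fuse(estimates):
    """Fuse independent Gaussian estimates given as (mean, variance) pairs.

    Each estimate is weighted by its precision (1 / variance), so confident
    observers dominate and the fused variance is never larger than the best
    single variance.
    """
    precision = sum(1.0 / var for _, var in estimates)
    mean = sum(m / var for m, var in estimates) / precision
    return mean, 1.0 / precision

def best_available(local, shared):
    """Hypothetical rule: act on whichever estimate has the smaller variance."""
    return local if local[1] <= shared[1] else shared
```

Applied per coordinate to each robot's ball estimate, this gives a shared ball position; a robot that currently sees the ball well would still prefer its own low-variance local estimate.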

[Opponent Modelling] We have developed techniques for building probabilistic models of an opponent team's strategy based on observing the history of the team's prior execution of the task. Using this probabilistic model, we have developed techniques to enable a coach agent to plan out a series of actions that best takes advantage of the opponent's weaknesses. The resulting plan is then monitored, with temporal constraints, to ensure correct execution and to trigger the appropriate response in the event of failed execution. This work has been validated empirically in the simulation league [Riley & Veloso, 02].

Recent Accomplishments

Following our established practice, we have fully validated our work at RoboCup 2004 and the RoboCup US Open 2004. Our results have been fully published and presented to the community, and where possible technology transition efforts have been made. Here we list our recent accomplishments:

[RoboCup 2004] We participated at RoboCup 2004 in Lisbon, Portugal. We entered our small-size robot team as a joint effort with RoboDragons from Aichi Prefectural University in order to investigate the concept of pickup teams. We also entered our Sony AIBO team CMPack, our simulation team CMLoki, and our coach team CMOwl. We also demonstrated our Segway RMP platform in order to promote the new domain of Segway Soccer to other researchers.

[RoboCup US Open 2004] We participated at the RoboCup US Open 2004, held in New Orleans. We entered with our Sony AIBO team CMPack, winning the championship, and demonstrated our Segway RMP platform in conjunction with the Neurosciences Institute.

[Skill Optimization and Learning] We have developed new techniques for optimizing motion execution, and learning new skills for robot control. We have validated these techniques and, in the case of motion optimization, led a community effort to develop robust optimization techniques applicable to robot control for quadruped robots.

[Coachable Teams] We have developed new techniques for coach agents based on learning an Abstract Markov Decision Process of team execution, and then solving this Abstract MDP to derive an optimal policy which is used to generate useful advice for each player in the team.

[Learning Sensor States] We have developed new classes of techniques for learning state from continuous time-series sensor signals. We have built upon our previous work in this area with new techniques that extend into multiple dimensions with real-time data.

[Pickup Teams] We have begun the exploration of pickup teams, where heterogeneous teams form with partial knowledge of their teammates' capabilities and algorithms. We empirically investigated this challenge with the RoboDragons from Aichi Prefectural University in Japan at the recent RoboCup 2004 event.

[DarpaTech] We demonstrated our techniques at the DarpaTech 2004 Symposium.

[US Army War College] We demonstrated our Segway platforms at the US Army War College during May, 2004.

[Segway for education] Our work has led to the use of the Segway RMP as the new robot education platform for Carnegie Mellon's Qatar initiative.

Technology Transition

Our work has led to technology transitions in a number of situations. We detail some of these below, and our efforts to promote further transitions.
[Sony AIBO software] We have released all of our Sony AIBO software. This has been used by other researchers to provide known opponents for them to test their algorithms against. Additionally, it has provided the base for new research teams to develop their software from (e.g. Georgia Institute of Technology).

[CMRoboBits] Based on our Sony AIBO software, we have developed a complete course and simplified software package for teaching robotics to undergraduates and graduates. The complete CMRoboBits course notes and software are available on-line.

[Segway for education] Our work has led to the use of the Segway RMP as the new robot education platform for Carnegie Mellon's Qatar initiative.

[CMVision] Our fast color vision library has been available for some time now. It is widely used in the robot soccer research community, and in the wider research community. It is used within the autonomous foosball project (which is being commercialized), among others. It also provides the benchmark standard for comparing similar algorithms in terms of performance and speed.

Technology transition efforts:
[Software] All of our software is regularly released to the community and is available on-line.

[Publications] All publications from our group are available on-line for perusal.

[Multi-media] All movies documenting our research are available on-line. This promotes interest in our work from abroad leading to greater exposure with the community.

[Robot hardware] All small-size robot hardware is available on-line.

Relevant Publications

An Evolutionary Approach To Gait Learning For Four-Legged Robots,
Sonia Chernova and Manuela Veloso.
In Proceedings of IROS'04, Sendai, Japan, September 2004.

State Identification From Robot Sensors Using Non-Parametric Statistics,
Scott Lenser and Manuela Veloso.
In Proceedings of IROS'04, Sendai, Japan, September 2004.

Turning Segways into Soccer Robots,
Jeremy Searock, Brett Browning, and Manuela Veloso.
In Proceedings of IROS'04, Sendai, Japan, September 2004.

Skill Acquisition and Use for a Dynamically-Balancing Soccer Robot,
Brett Browning, Ling Xu, and Manuela Veloso.
In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04), San Jose, July 2004.

Advice Generation from Observed Execution: Abstract Markov Decision Process Learning,
Patrick Riley and Manuela Veloso.
In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04), San Jose, July 2004.

Accurate and flexible simulation for dynamic, vision-centric robots,
Jared Go, Brett Browning, and Manuela Veloso.
In Proceedings of The Third International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS04), New York, July 2004.

Learning from accelerometer data on a legged robot,
Douglas Vail and Manuela Veloso.
In Proceedings of the 5th IFAC/EURON Symposium on Intelligent Autonomous Vehicles (IAV2004), Lisbon, Portugal, July 2004.

CommLang: Communication for Coachable Agents,
John Davin, Patrick Riley, and Manuela Veloso.
In Proceedings of the RoboCup International Symposium, Lisbon, Portugal, July 2004.

Segway CM-RMP Robot Soccer Player,
Jeremy Searock, Brett Browning, and Manuela Veloso.
In Proceedings of the RoboCup International Symposium, Lisbon, Portugal, July 2004.

Plays as team plans for coordination and adaptation,
Michael Bowling, Brett Browning, and Manuela Veloso.
In Proceedings of the 14th International Conference on Automated Planning and Scheduling (ICAPS-04), Vancouver, June 2004.

Development of a soccer-playing dynamically-balancing mobile robot,
Brett Browning, Paul Rybski, Jeremy Searock, and Manuela Veloso.
In Proceedings of ICRA-2004, New Orleans, May 2004.

Learning and using models of kicking motions for legged robots,
Sonia Chernova and Manuela Veloso.
In Proceedings of ICRA-2004, New Orleans, May 2004.

CMRoboBits: Creating an Intelligent AIBO Robot,
Manuela Veloso, Scott Lenser, Douglas Vail, Paul E. Rybski, Nick Aiwazian, and Sonia Chernova.
In Proceedings of the AAAI Spring Symposium on Accessible Hands-on Artificial Intelligence and Robotics Education, Stanford, March 2004.

Relevant Video Footage

A number of videos demonstrating the research we have conducted under the MARS program are available on-line. These videos were taken from various RoboCup competitions, various demonstrations, as well as within our laboratories under experimental conditions. We have categorized these videos based on the specific domain. Unless specifically noted, all videos show robots operating autonomously:

Sony Aibo videos: http://www.cs.cmu.edu/~robosoccer/legged/movies/

Small-size videos: http://www.cs.cmu.edu/~robosoccer/small/movies/

Segway videos: http://www.cs.cmu.edu/~robosoccer/segway/movies/


Manuela Veloso
Last modified: Thu Aug 1 2004