MARS Final Technical Report
June, 2004
Principal Investigator: Prof. Manuela Veloso (mmv@cs.cmu.edu)
Address: Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, 15213
ph: 412 268 6021
fax: 412 268 4801
Project URL
Primary web site: http://www.cs.cmu.edu/~coral/MARS
Research web site: http://www.cs.cmu.edu/~coral
Robot Soccer web site: http://www.cs.cmu.edu/~robosoccer
Objective
Our DARPA MARS research effort has focused on the problem of robot
autonomy in multi-robot team tasks set in highly dynamic, adversarial
environments against unknown opponents. The research effort has been a
broad initiative, covering a range of domains from simulated robot
agents through to fully autonomous individual robots with complex
kinematic structure. Our primary focus has been to develop complete,
effective, and adaptive software for autonomous robot teams that
enables a team to perform its complex real-time task while adapting its
strategy, behavior, and control methods to the opponent team and
environment. Moreover, our goal has been to develop techniques and
algorithms that work, meaning that all of our approaches must be
extensively validated in real-world tasks. Our evaluations have focused
on testing the performance of our techniques within the domain of robot
soccer, where teams of robots compete autonomously in games of soccer.
Indeed, over the course of our research project, we have competed with
our various teams in more than 7 international and regional
competitions, amounting to over one hundred games in different domains
against teams using unknown strategies.
Approach
Our approach is characterized by a consistent focus on complete autonomous multi-robot
teams operating in adversarial domains. Complete means that each robot
must continuously perceive, think, and act within the team environment
in real-time, where real-time means at least as fast as the opponents.
This means each component of the robot's decision cycle must perform to
expectations, as well as integrating into a seamless whole in order to
achieve good team performance. The teams are fully autonomous, meaning
that there is no human involved in the decision making cycle during
task execution.
We divide our work into two main components: developing individually
skilled team members, and developing robots that are able to work as a
team. In each case, our work includes significant efforts to apply
learning and adaptation throughout the control hierarchy in a practical
and useful way. We first describe the domains we work with. All of our
robot domains are based upon robot soccer, where teams of autonomous
robots compete in a game of soccer. Under DARPA MARS funding we have
investigated a range of domains including:
Domains
[Sony AIBO Legged League] The legged league consists of teams of
Sony AIBOs, where each AIBO is
autonomous and can communicate via 802.11b wireless ethernet. Each
robot is a quadruped with 16 degrees of freedom, with a color camera as
its primary perception mechanism. Each robot also has accelerometers to
act as a G-sensor. The robots are unmodified, except for the software,
which is written in C++ using the SDK made available by Sony to provide
the low-level API for controlling the robot. The field is approximately
4m long and 3m wide and is surrounded by four color coded markers and
two colored goals.
[Small-Size Robot League] Each
team consists of five small robots, where the robots are custom built
and must fit within an 18cm diameter cylinder that is at most 15cm tall.
Robots use ball manipulation devices including a kicker, and a special
purpose dribbling device to control the ball. Perception is provided by
overhead cameras and control by an off-field computer. Small-size
robots are therefore autonomous as a team. The ball is an orange golf
ball, and the field is 4.9m long by 3.3m wide with no walls.
[Segway Soccer League] This is
a new domain we have developed under the MARS program. It makes use of
both humans and robots, where all platforms are based on the Segway
mobility platforms. We have proposed a new league to entice other
researchers to participate in this domain. The ball is a standard size
5 soccer ball, and the field scales with the number of players so that
11 vs 11 gives a full size soccer field.
[Simulation Robot and Coach League]
Simulation soccer runs in a 2D or 3D simulator where each simulated
robot is given high level perceptual information at a predefined
frame rate. The game uses dimensions comparable to humans playing on a
soccer field. We also have a special coach competition whereby a
coachable team is provided, and it is up to the coach to learn how to
best provide advice to the team based on prior observation of the team
and opponents.
Technical Approach
All multi-robot systems need team members with sufficient individual
skills to allow the team to achieve its goals. We have focused our
efforts on developing individually skilled robots that are able to
participate in a team and adapt to their environments. Individual
skills can be further broken into perception, cognition, and action,
where learning can be applied to each component.
Perception
All of our robots utilize color vision as their primary sensing
modality. In addition, depending upon the platform in question, the
robots may make use of odometry information, joint angles and
accelerometer (G-sensor) data. As with any sensor, information is
gleaned from the world at a steady, pre-defined rate, but it is
corrupted by systematic bias and by noise of both Gaussian and
non-Gaussian forms, and it suffers from unavoidable latency. The
purpose of robot perception is to extract information from these noisy
sensors and to fuse information from different sensors (and perhaps
from other robots) to provide a useful model of the current state of
the world, so that the control hierarchy can make intelligent, informed
decisions and act upon them. Thus, perception consists of raw sensory processing,
followed by tracking and information fusion to obtain a model of the
world state.
[Vision Based Perception] In prior work under the MARS program, we
developed a fast color vision library called CMVision [Bruce et al.
00]. CMVision is now widely used within robotics and related
communities, and is also used as a benchmark for comparing speeds of
new color vision techniques. CMVision operates in a three-step process.
First, each pixel is classified into a color class using a lookup table
for speed. Second, neighboring pixels are collected to form contiguous
regions using run-length encoding and connected component analysis.
Finally, the regions themselves are analyzed using shape and geometric
constraints to identify a priori known obstacles.
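The three steps can be illustrated with a minimal Python sketch. It is not the CMVision implementation: the character-coded image, the lookup table contents, the region dictionary layout, and the size constraint are all assumptions chosen to keep the example small.

```python
def classify(image, lut):
    """Step 1: map every pixel to a color class with a lookup table (0 = background)."""
    return [[lut.get(pixel, 0) for pixel in row] for row in image]


def run_length_encode(classes):
    """Step 2a: compress each row into (row, first_col, last_col, color) runs."""
    runs = []
    for y, row in enumerate(classes):
        x = 0
        while x < len(row):
            start = x
            while x < len(row) and row[x] == row[start]:
                x += 1
            if row[start] != 0:
                runs.append((y, start, x - 1, row[start]))
    return runs


def connect_runs(runs):
    """Step 2b: merge same-color runs that touch on adjacent rows (union-find)."""
    parent = list(range(len(runs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (y1, s1, e1, c1) in enumerate(runs):
        for j in range(i + 1, len(runs)):
            y2, s2, e2, c2 = runs[j]
            if y2 > y1 + 1:
                break  # runs are ordered by row, so nothing later can touch run i
            if y2 == y1 + 1 and c1 == c2 and s1 <= e2 and s2 <= e1:
                parent[find(j)] = find(i)

    regions = {}
    for i, (y, s, e, c) in enumerate(runs):
        r = regions.setdefault(
            find(i), {"color": c, "area": 0, "x0": s, "x1": e, "y0": y, "y1": y})
        r["area"] += e - s + 1
        r["x0"], r["x1"] = min(r["x0"], s), max(r["x1"], e)
        r["y0"], r["y1"] = min(r["y0"], y), max(r["y1"], y)
    return list(regions.values())


def find_objects(regions, color, min_area):
    """Step 3: keep regions of the wanted color that pass a simple size constraint."""
    return [r for r in regions if r["color"] == color and r["area"] >= min_area]


# Hypothetical 6x4 image: 'o' = orange pixel, 'y' = yellow pixel, '.' = anything else.
LUT = {"o": 1, "y": 2}
IMAGE = ["..oo..", "..oo.y", "......", "o....."]
regions = connect_runs(run_length_encode(classify(IMAGE, LUT)))
balls = find_objects(regions, color=1, min_area=2)
```

Here the 2x2 orange blob survives the size constraint while the isolated orange pixel is rejected as noise. A real system would use richer shape and geometric tests in the last step.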
We have used CMVision in a variety of domains including the Sony
AIBOs, which have on-board vision with limited CPU resources, and the
small-size league, which makes use of an overhead camera. In our recent work,
we have extended CMVision to include unified multi-camera detection of
a robot team from an overhead perspective. In particular, we have
investigated techniques for handling control in the regions of overlap
between different camera perspectives. This system was validated at
RoboCup 2004 under ambient lighting conditions.
We have investigated alternative color vision based techniques with our
Segway platforms. We have developed general region growing techniques
based on a homogeneity criterion: fixed color distance from the initial
threshold. Following our prior work, color classification uses a lookup
table on the average region color, and high level object detection uses
geometric constraints. We are investigating techniques to enable the
object detection to operate independent of illumination variation
[Browning et al 04, and additional publications forthcoming].
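As a rough illustration of region growing under a fixed color-distance criterion, the following Python sketch grows a 4-connected region from a seed pixel and then averages its color. The choice of the seed pixel's color as the reference, the Euclidean RGB distance, and the function names are assumptions, not details of the Segway vision system.

```python
from collections import deque


def grow_region(image, seed, max_dist):
    """Collect 4-connected pixels whose color lies within max_dist of the seed color."""
    height, width = len(image), len(image[0])
    seed_color = image[seed[0]][seed[1]]

    def homogeneous(color):
        return sum((a - b) ** 2 for a, b in zip(color, seed_color)) <= max_dist ** 2

    region = {seed}
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < height and 0 <= nx < width
                    and (ny, nx) not in region and homogeneous(image[ny][nx])):
                region.add((ny, nx))
                queue.append((ny, nx))
    return region


def mean_color(image, region):
    """Average region color, which would then be classified with a lookup table."""
    n = len(region)
    return tuple(sum(image[y][x][c] for y, x in region) / n for c in range(3))


# Hypothetical RGB values for a tiny 2x3 image.
ORANGE, ORANGE2, GREEN = (250, 120, 10), (240, 130, 20), (10, 200, 10)
image = [[ORANGE, ORANGE2, GREEN],
         [GREEN, ORANGE2, GREEN]]
region = grow_region(image, seed=(0, 0), max_dist=30)
average = mean_color(image, region)
```

Classifying the average color of a region, rather than each pixel, makes the class decision less sensitive to per-pixel noise.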
[Multi-Object Tracking] We
have developed robust Kalman filter-based multi-object tracking
techniques. In earlier work under the MARS program we developed
improbability filtering methods for reducing the rate of
false-positives and their resulting detrimental effect on tracking
[Browning et al. 02].
Recently, we have developed Kalman-filter based techniques for object
tracking for robots with on-board perception for both the Sony AIBOs
and Segway platforms [publication in preparation].
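For readers unfamiliar with Kalman filter tracking, here is a minimal Python sketch of a constant-velocity filter for one coordinate of a tracked object. The motion model, the noise values, and the class name are assumptions for illustration; the sketch does not include improbability filtering or multi-object data association.

```python
class KalmanTracker1D:
    """Constant-velocity Kalman filter for one coordinate of a tracked object."""

    def __init__(self, pos=0.0, vel=0.0, pos_var=1.0, vel_var=1.0):
        self.pos, self.vel = pos, vel
        # Covariance matrix [[a, b], [c, d]] stored as four scalars.
        self.a, self.b, self.c, self.d = pos_var, 0.0, 0.0, vel_var

    def predict(self, dt, q_pos=1e-4, q_vel=1e-4):
        """Propagate the state and covariance forward by dt seconds."""
        self.pos += self.vel * dt
        a, b, c, d = self.a, self.b, self.c, self.d
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and diagonal Q.
        self.a = a + dt * (b + c) + dt * dt * d + q_pos
        self.b = b + dt * d
        self.c = c + dt * d
        self.d = d + q_vel

    def update(self, z, r=1e-2):
        """Correct the state with a position measurement z of variance r."""
        innovation = z - self.pos
        s = self.a + r                       # innovation variance
        k_pos, k_vel = self.a / s, self.c / s
        self.pos += k_pos * innovation
        self.vel += k_vel * innovation
        a, b, c, d = self.a, self.b, self.c, self.d
        # P <- (I - K H) P with H = [1, 0].
        self.a, self.b = (1 - k_pos) * a, (1 - k_pos) * b
        self.c, self.d = c - k_vel * a, d - k_vel * b


# Noise-free position readings of an object moving at 1 m/s, sampled every 0.1 s.
tracker = KalmanTracker1D()
for k in range(1, 101):
    tracker.predict(dt=0.1)
    tracker.update(z=0.1 * k)
```

Although only position is measured, the filter recovers the velocity, which is what allows a robot to intercept a moving ball despite sensing latency.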
[Localization] In prior work we
have developed robust extensions to Monte-Carlo Localization, called
Sensor Resetting Localization (SRL) [Lenser & Veloso, 00]. We have
recently revised and extended this approach into a particle filter
that uses a hybrid MCMC update technique with Metropolis sampling, derived
from the physics and particle filter literature. This technique has
been fully implemented on the Sony AIBO platforms and makes use of a
wide range of visual features including field markers, goals, and field
edges.
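The sensor-resetting idea can be illustrated with a minimal Python sketch for a robot on a line that receives a noisy reading of its own position. The one-dimensional pose, the Gaussian sensor model, the threshold value, and the exact rule for how many particles to reset are assumptions made for illustration; the sketch omits the motion update and does not show the MCMC revision described above.

```python
import math
import random


def likelihood(particle, reading, sigma):
    """Unnormalized Gaussian sensor model, scaled so a perfect match scores 1."""
    return math.exp(-0.5 * ((particle - reading) / sigma) ** 2)


def srl_update(particles, reading, sigma=0.2, threshold=0.1, rng=random):
    """One sensor update of Monte-Carlo localization with sensor resetting."""
    weights = [likelihood(p, reading, sigma) for p in particles]
    average = sum(weights) / len(weights)
    # Sensor resetting: the worse the particles explain the reading, the more
    # of them are replaced by samples drawn from the sensor model itself.
    n_reset = int(len(particles) * max(0.0, 1.0 - average / threshold))
    n_keep = len(particles) - n_reset
    # n_keep > 0 implies average > 0, so the weights are valid for resampling.
    kept = rng.choices(particles, weights=weights, k=n_keep) if n_keep else []
    reset = [rng.gauss(reading, sigma) for _ in range(n_reset)]
    return kept + reset


rng = random.Random(0)
particles = [rng.gauss(0.0, 0.1) for _ in range(200)]      # belief: robot near x = 0
particles = srl_update(particles, reading=5.0, rng=rng)    # "kidnapped": sensor says x = 5
mean = sum(particles) / len(particles)
```

When the particles agree with the reading, the average likelihood exceeds the threshold and the update reduces to ordinary weighted resampling. After a large unmodeled displacement, the filter recovers in a single update instead of waiting for stray particles to land near the true pose.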
[Local Model] We have
developed a local obstacle model for navigation derived from region
edges and camera projective geometry. The model enables reactive
navigation by providing a local map of the surrounding terrain that is
updated in real time, in step with the robot's perception.
It also enables action selection to be better informed by
providing context on the surrounding environment to the decision making
process. We have validated this technique in our Sony AIBO team at
RoboCup events, as well as in controlled laboratory experiments [Fasola
et al, 04, Lenser et al 03].
Single Robot Cognition and Action
[Skills, Tactics] We have developed a hierarchical control
architecture called STP: Skills, Tactics, and Plays [Browning et al,
03, forthcoming publication]. Plays deal with team behavior, and are
discussed below. Tactics and skills control single robot behavior. Each
tactic specifies an augmented finite state machine (AFSM) of skills and
an evaluation function. The AFSM defines what skills may be executed,
where transitions between skills in the AFSM are a function of
the state of the world. The evaluation function determines what needs to be done to achieve
the goal of the tactic and specifies the parameters that are passed to
the executing skill. Each skill is a focused control policy, where
focused means that it is only defined over a portion of the state
space. This restriction means that when creating a skill, the designer
need only worry about the behavior of the control policy over the
relevant states rather than in all possible circumstances. The
hierarchical decomposition that STP provides facilitates easy
development of new behaviors, or modification of pre-existing
behaviors. It also provides a natural mechanism to seamlessly
integrate learned skills with hand-coded ones. We discuss this more
below.
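The relationship between a tactic, its state machine of skills, and its evaluation function can be sketched in Python as follows. This is a hypothetical illustration, not the STP code: the `Tactic` class, the world-state keys (`ball_dist`, `goal_angle`), the skill names, and the string commands are all invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

World = Dict[str, float]
Skill = Callable[[World, dict], str]


@dataclass
class Tactic:
    skills: Dict[str, Skill]
    # For each skill: (condition on the world state, skill to switch to).
    transitions: Dict[str, List[Tuple[Callable[[World], bool], str]]]
    evaluate: Callable[[World], dict]   # computes parameters for the active skill
    active: str

    def step(self, world: World) -> str:
        """Advance the state machine, then run the active skill with fresh parameters."""
        for condition, next_skill in self.transitions.get(self.active, []):
            if condition(world):
                self.active = next_skill
                break
        return self.skills[self.active](world, self.evaluate(world))


def goto_ball(world, params):
    return "drive to ball"


def kick(world, params):
    return f"kick at {params['aim']:.1f} rad"


shoot = Tactic(
    skills={"goto_ball": goto_ball, "kick": kick},
    transitions={
        "goto_ball": [(lambda w: w["ball_dist"] < 0.1, "kick")],
        "kick": [(lambda w: w["ball_dist"] >= 0.1, "goto_ball")],
    },
    evaluate=lambda w: {"aim": w["goal_angle"]},
    active="goto_ball",
)

first = shoot.step({"ball_dist": 1.0, "goal_angle": 0.3})
second = shoot.step({"ball_dist": 0.05, "goal_angle": 0.3})
```

Because each skill is just a callable with a fixed signature, a learned policy can replace a hand-coded one without touching the tactic.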
[Real-time Path Planning] We
have developed a real-time robot path planning technique, based on
Rapidly Exploring Random Trees (RRTs) that plans sufficiently quickly
to be applied to a 5 robot team which fully replans each robot path
every frame at 30Hz [Bruce & Veloso, 03]. In more recent work, we
have investigated a similar approach using probabilistic roadmap
techniques to provide rapid replanning in a dynamic world while
fulfilling an objective function (such as maintaining line of sight to
a target or targets). We are investigating approaches to combine both
techniques to obtain the benefits of both.
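For reference, a basic RRT can be sketched in a few lines of Python. This is the textbook algorithm, not the real-time planner of [Bruce & Veloso, 03]; the field bounds, step size, goal bias, and circular obstacle model are assumptions.

```python
import math
import random


def segment_hits_circle(a, b, center, radius):
    """True if the segment a-b passes within radius of center."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    t = 0.0
    if length_sq > 0.0:
        t = ((center[0] - a[0]) * dx + (center[1] - a[1]) * dy) / length_sq
        t = max(0.0, min(1.0, t))
    closest = (a[0] + t * dx, a[1] + t * dy)
    return math.dist(closest, center) <= radius


def rrt(start, goal, obstacles, bounds=(0.0, 10.0), step=0.5,
        goal_bias=0.2, goal_tol=0.5, max_iters=5000, rng=None):
    """Grow a tree from start toward random targets; return a path or None."""
    rng = rng or random.Random(0)
    nodes, parent = [start], {0: None}
    for _ in range(max_iters):
        # With probability goal_bias, extend toward the goal instead of a random point.
        if rng.random() < goal_bias:
            target = goal
        else:
            target = (rng.uniform(*bounds), rng.uniform(*bounds))
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], target))
        near, d = nodes[i], math.dist(nodes[i], target)
        if d == 0.0:
            continue
        s = min(step, d)
        new = (near[0] + s * (target[0] - near[0]) / d,
               near[1] + s * (target[1] - near[1]) / d)
        if any(segment_hits_circle(near, new, c, r) for c, r in obstacles):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i
        if math.dist(new, goal) <= goal_tol:
            path, k = [], len(nodes) - 1
            while k is not None:          # walk back to the root
                path.append(nodes[k])
                k = parent[k]
            return path[::-1]
    return None


OBSTACLES = [((5.0, 5.0), 1.5)]           # one circular obstacle: (center, radius)
path = rrt((1.0, 1.0), (9.0, 9.0), OBSTACLES)
```

The linear nearest-neighbor search used here is the simplest choice; replanning for a whole team at frame rate would require a faster lookup and other optimizations.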
Individual Learning
[Learning Action Outcomes] We have developed techniques to build
probabilistic models of the outcomes of discrete actions (in this case
kicks) in order to provide a predictive model to aid in robot decision
making. Prior to task execution, each robot collects data on the
outcome of each of its different actions. This data is used to build a
statistical model of each action. During run-time execution, whenever a
choice of different actions occurs, the robot analyzes the predicted
outcomes and chooses the action that leads to the result with the
highest utility. This work has been fully implemented and tested on the
Sony AIBO platform in RoboCup competitions [Chernova & Veloso, 04].
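The selection step can be sketched as follows, assuming that each kick's model is simply the set of recorded outcomes (distance travelled and angle relative to the robot's heading, taken here to point along +x) and that utility is the negative distance from the predicted landing point to a target. The kick names, sample values, and utility function are hypothetical.

```python
import math


def expected_utility(outcomes, ball, target):
    """Average utility of a kick over its recorded (distance, angle) outcomes."""
    total = 0.0
    for distance, angle in outcomes:
        landing = (ball[0] + distance * math.cos(angle),
                   ball[1] + distance * math.sin(angle))
        total += -math.dist(landing, target)   # closer to the target is better
    return total / len(outcomes)


def choose_kick(models, ball, target):
    """Pick the kick whose predicted outcomes have the highest expected utility."""
    return max(models, key=lambda name: expected_utility(models[name], ball, target))


# Hypothetical data collected before the game: (distance in m, angle in rad).
MODELS = {
    "forward": [(1.0, 0.0), (1.2, 0.1), (0.9, -0.1)],
    "left": [(0.8, 1.5), (0.7, 1.6)],
}
ahead = choose_kick(MODELS, ball=(0.0, 0.0), target=(2.0, 0.0))
sideways = choose_kick(MODELS, ball=(0.0, 0.0), target=(0.0, 2.0))
```

Averaging over recorded samples uses the empirical outcome distribution directly; a parametric model of each kick would serve the same role.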
[Learning Sensor States] Being
able to robustly learn, classify, and recognize 'state context' from a
stream of sensor measurements is one of the key challenges to robot
intelligence. If a robot can recognize the state of the world, and
select the appropriate controller, it can potentially extend the range
of even standard control techniques to handle much more dynamic
situations by switching between controllers appropriately. We have
explored a number of techniques using non-parametric statistical
time-series methods to learn and classify sensor state from a sequence
of continuous, multi-dimensional sensor vectors. These techniques have
been applied to the Sony AIBO platform under various conditions
including variable lighting [Lenser & Veloso, 03], and G-sensor
data when walking under different conditions [Vail & Veloso, 04].
[Motion Optimization] We have
explored an approach to learning optimal motion parameters for a
quadruped robot (a Sony AIBO). Our approach uses genetic algorithms to
learn to optimize the parameters controlling the walk motion on a Sony
AIBO [Chernova & Veloso, 04]. The resulting walk on an ERS-7
reaches speeds of 350mm/s, which is the fastest walk ever observed on a
Sony AIBO. Optimizing walk parameters is a difficult process given the
high dimensionality of the state space (approximately 12-24 dimensions
depending upon which symmetries are enabled), and the amount of noise
in each fitness evaluation.
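A generic genetic algorithm for a noisy fitness function can be sketched as below. It is not the algorithm of [Chernova & Veloso, 04]: the population size, truncation selection, uniform crossover, Gaussian mutation, and the synthetic fitness function standing in for measured walk speed are all assumptions.

```python
import random


def evolve(fitness, n_params, pop_size=20, generations=30, sigma=0.1, rng=None):
    """Maximize a (possibly noisy) fitness over parameter vectors in [0, 1]^n."""
    rng = rng or random.Random(0)
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        # Every individual is re-evaluated each generation, so a single lucky
        # noisy measurement cannot keep a poor individual alive indefinitely.
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]          # uniform crossover
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma)))    # Gaussian mutation
                     for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


# Synthetic stand-in for a walk-speed measurement: a hidden optimum plus noise.
TARGET = [0.3, 0.7, 0.5, 0.2]
noise = random.Random(1)


def true_speed(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


def measured_speed(params):
    return true_speed(params) + noise.gauss(0.0, 0.01)


best = evolve(measured_speed, n_params=len(TARGET))
```

On a real robot each fitness evaluation is a timed walk, so the number of evaluations, not computation, is the dominant cost.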
[Skill Learning] We have developed new techniques based on
function approximation to learn robot skills (in the context of the STP
architecture) from a human teacher. The human teacher provides example
traces of execution demonstrating the desired skill. The robot is then
able to recall, and generalize, from these execution traces to execute
a focused control policy. This technique works in conjunction with hand
coded policies in a robot control architecture in a seamless manner
[Browning, Xu, Veloso 04].
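One simple form of function approximation over demonstration traces is kernel-weighted averaging, sketched below. The specific approximator, the bandwidth, and the support threshold are assumptions and are not taken from [Browning, Xu, Veloso 04]. Returning `None` far from every demonstration mirrors the idea that a skill is a focused policy defined only over part of the state space.

```python
import math


def recall(demos, state, bandwidth=0.5, min_support=1e-3):
    """Kernel-weighted average of demonstrated commands near the query state.

    demos is a list of (state, command) pairs of equal-length tuples.
    Returns None when the query lies outside the demonstrated region.
    """
    weights = [math.exp(-math.dist(s, state) ** 2 / (2 * bandwidth ** 2))
               for s, _ in demos]
    total = sum(weights)
    if total < min_support:
        return None
    dims = len(demos[0][1])
    return tuple(sum(w * cmd[i] for w, (_, cmd) in zip(weights, demos)) / total
                 for i in range(dims))


# Hypothetical trace: state = (distance to ball in m,), command = (forward speed in m/s,).
DEMOS = [((0.0,), (0.0,)), ((1.0,), (1.0,)), ((2.0,), (1.5,))]
nearby = recall(DEMOS, (1.0,))
far_away = recall(DEMOS, (50.0,))
```

When `recall` returns `None`, a surrounding tactic could fall back to a hand-coded skill, which is one way learned and hand-coded policies can coexist.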
Team Cognition and Learning
[Plays for Coordination] As described above, we have developed a
Skills, Tactics, Plays architecture (STP) for teamwork in adversarial
environments [Browning et al 03, Bowling et al 04, additional
publications forthcoming]. Plays are the component for team
interaction. Each play is a fixed team plan that encodes a
sequence of synchronized actions for each role in the team. Roles are
assigned to players dynamically during execution, thereby allowing
flexibility of execution. Plays are selective, in that they have
applicability conditions defined on the world state that determine when
a play can execute. Plays have termination criteria that decide when a
play should stop executing and what the result of the play is
upon termination. The latter is used to adapt future play selection as
described below. We have developed a play language, which allows rapid
development of new plays. Indeed, our RoboCup experience has shown
that a completely new set of plays, i.e., a new team strategy,
can be developed within an hour or so. To our knowledge, this is
significantly faster than any other mechanism currently available in the community.
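The main ingredients of a play (applicability condition, roles, dynamic role assignment, and a result that feeds back into selection) can be sketched in Python. This is not the play language itself: the dictionary layout, the greedy closest-robot assignment, and the multiplicative weight update are placeholders chosen for illustration.

```python
import math

PLAYS = [
    {
        "name": "attack",
        "applicable": lambda w: w["ball"][0] > 0.0,        # ball in the opponent half
        # Roles in priority order, each with a function giving its target point.
        "roles": [("shooter", lambda w: w["ball"]),
                  ("defender", lambda w: (-1.5, 0.0))],
        "weight": 1.0,
    },
    {
        "name": "defend",
        "applicable": lambda w: w["ball"][0] <= 0.0,
        "roles": [("blocker", lambda w: w["ball"]),
                  ("goalie", lambda w: (-2.0, 0.0))],
        "weight": 1.0,
    },
]


def select_play(plays, world):
    """Choose the applicable play with the largest weight, or None."""
    candidates = [p for p in plays if p["applicable"](world)]
    return max(candidates, key=lambda p: p["weight"]) if candidates else None


def assign_roles(play, robots, world):
    """Greedy dynamic assignment: each role takes the closest robot still free."""
    free = dict(robots)
    assignment = {}
    for role, target_of in play["roles"]:
        target = target_of(world)
        robot = min(free, key=lambda name: math.dist(free[name], target))
        assignment[role] = robot
        del free[robot]
    return assignment


def record_result(play, succeeded, rate=1.5):
    """Placeholder adaptation rule: reward plays that succeed, penalize failures."""
    play["weight"] *= rate if succeeded else 1.0 / rate


world = {"ball": (1.0, 0.0)}
robots = {"r1": (0.9, 0.1), "r2": (-1.0, 0.0)}
play = select_play(PLAYS, world)
roles = assign_roles(play, robots, world)
record_result(play, succeeded=False)
```

Because roles are bound to robots only at execution time, the same play remains usable when robots swap positions or one robot is removed from the field.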
[Pickup Teams] We have recently
begun extending our STP architecture to the problem of pickup teams. A pickup team is
a heterogeneous team that forms with partial, or no a priori,
information about each teammate. In such a situation the team needs to
rapidly acquire knowledge about each teammate and negotiate what roles
will be assigned on the team. We have taken the first step towards a
pickup team framework by forming a joint team with the RoboDragons,
from Aichi Prefectural University in Japan. Our teams joined, and
competed at RoboCup 2004, with little a priori knowledge of each other's
frameworks except the agreed upon interface at the vision level and at
the robot command level. Many issues were raised by this challenge that
we are currently investigating.
[Token based Coordination] We
have investigated a market based scheme combined with a token passing
mechanism to coordinate role assignment between distributed
robots in a highly uncertain world. In particular, we have made use of
our work on shared perception so that the bidding process of the
market based system can operate independently of network
communications, thus preserving bandwidth on high latency networks.
This technique has been empirically validated on our Sony AIBO robots
[Vail & Veloso 03].
[Coaching] We have developed a coach framework whereby an agent
can observe a team performing a task, and then offer advice to help
improve team performance. The coach generates advice for the team by
observing the performance of the team, and other teams, in the past
from game logs. The coach generates an abstract Markov Decision Process
to describe the task faced by the team [Riley & Veloso 04]. By
solving this Abstract MDP, the coach then provides useful information,
in the form of an abstract policy, that the robots use during execution
to make better decisions. We have validated this approach on many logs
and simulation games obtaining statistically significant results that
demonstrate the usefulness of the coach in improving team performance.
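The overall flow (estimate a model from logs, solve it, hand the resulting policy to the team) can be sketched with a small tabular MDP. The abstract states, actions, log entries, and rewards below are invented, and plain value iteration stands in for whatever solution method the coach actually uses.

```python
from collections import Counter, defaultdict


def learn_transitions(log):
    """Estimate P(next_state | state, action) from (state, action, next_state) triples."""
    counts = defaultdict(Counter)
    for state, action, next_state in log:
        counts[(state, action)][next_state] += 1
    return {sa: {s2: n / sum(c.values()) for s2, n in c.items()}
            for sa, c in counts.items()}


def solve(transitions, reward, gamma=0.9, sweeps=100):
    """Value iteration; reward is paid on entering a state. Returns (values, policy)."""
    states = {s for s, _ in transitions} | {s2 for d in transitions.values() for s2 in d}
    actions = {s: [a for (s0, a) in transitions if s0 == s] for s in states}
    value = {s: 0.0 for s in states}

    def q(s, a):
        return sum(p * (reward.get(s2, 0.0) + gamma * value[s2])
                   for s2, p in transitions[(s, a)].items())

    for _ in range(sweeps):
        for s in states:
            if actions[s]:
                value[s] = max(q(s, a) for a in actions[s])
    policy = {s: max(actions[s], key=lambda a: q(s, a)) for s in states if actions[s]}
    return value, policy


# Hypothetical abstracted game log.
LOG = ([("midfield", "pass", "attack")] * 3 + [("midfield", "pass", "lost")]
       + [("midfield", "dribble", "attack")] + [("midfield", "dribble", "lost")] * 3
       + [("attack", "shoot", "goal")] * 2 + [("attack", "shoot", "lost")] * 2
       + [("attack", "dribble", "goal")] + [("attack", "dribble", "lost")] * 3)
value, policy = solve(learn_transitions(LOG), reward={"goal": 1.0})
```

The resulting policy ("pass from midfield, shoot when attacking") is the kind of abstract advice a coach could translate into its advice language for the players.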
[Shared Perception] In our
prior work under the MARS program we have developed probabilistic
techniques for fusing perceptual information from multiple robots into
a shared world model. Moreover, we have investigated techniques for
determining which information, local or shared, a robot should use in
order to act using its best available knowledge [Roth, Vail, Veloso,
03].
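A minimal sketch of both ideas follows, assuming each robot reports a ball position with a single scalar variance. Inverse-variance weighting is the standard way to fuse independent Gaussian estimates; the rule for preferring the local estimate and its threshold are assumptions, not the method of [Roth, Vail, Veloso, 03].

```python
def fuse(estimates):
    """Inverse-variance fusion of independent ((x, y), variance) estimates."""
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    x = sum(w * pos[0] for w, (pos, _) in zip(weights, estimates)) / total
    y = sum(w * pos[1] for w, (pos, _) in zip(weights, estimates)) / total
    return (x, y), 1.0 / total


def best_estimate(local, shared, max_local_var=0.05):
    """Trust a confident local observation; otherwise fall back to the fused model."""
    if local is not None and local[1] <= max_local_var:
        return local
    candidates = list(shared) + ([local] if local is not None else [])
    return fuse(candidates) if candidates else None


fused = fuse([((0.0, 0.0), 1.0), ((2.0, 0.0), 1.0)])
seen_clearly = best_estimate(((1.0, 1.0), 0.01), [((3.0, 3.0), 0.5)])
not_seen = best_estimate(None, [((3.0, 3.0), 0.5)])
```

Preferring a confident local observation avoids acting on teammates' estimates that are delayed by the network or degraded by their own localization error.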
[Opponent Modelling] We have
developed techniques for building probabilistic models of an opponent
team's strategy based on observing the history of the team's prior
execution of the task. Using this probabilistic model, we have
developed techniques to enable a coach agent to plan out a series of actions that
best takes advantage of the opponent's weaknesses. The resulting plan
is then monitored, with temporal constraints, to ensure correct
execution and to trigger the appropriate response in the event of
failed execution. This work has been validated empirically in the
simulation league [Riley & Veloso, 02].
Recent Accomplishments
Following our established practice, we have fully validated our work at
RoboCup 2004 and the RoboCup US Open 2004. Our results have been fully
published and presented to the community, and where possible technology
transition efforts have been made. Here we list our recent
accomplishments:
[RoboCup 2004] We participated at RoboCup
2004 in Lisbon, Portugal. We entered our small-size robot team as a
joint effort with RoboDragons from Aichi Prefectural University in
order to investigate the concept of pickup teams. We entered our Sony
AIBO team CMPack, our simulation team CMLoki, and our coach team
CMOwl. We also demonstrated our Segway RMP platform in order to promote
the new domain of Segway Soccer to other researchers.
[RoboCup USOpen 2004] We participated at the RoboCup
USOpen 2004, held in New Orleans. We entered with our Sony AIBO
team CMPack, winning the championship, and demonstrated our Segway RMP
platform in conjunction with the Neurosciences Institute.
[Skill Optimization and Learning]
We have developed new techniques for optimizing motion execution, and
learning new skills for robot control. We have validated these
techniques, and in the case of motion optimization, led a community
effort to develop robust optimization techniques applicable to robot
control for quadruped robots.
[Coachable Teams] We have
developed new techniques for coach agents based on learning an Abstract
Markov Decision Process of team execution, and then solving this
Abstract MDP to derive an optimal policy which is used to generate
useful advice for each player in the team.
[Learning Sensor States] We
have developed new classes of techniques for learning state from
continuous time-series sensor signals. We have built upon our previous
work in this area with new techniques that extend into multiple
dimensions with real-time data.
[Pickup Teams] We have begun
the exploration of pickup teams, where heterogeneous teams form with
partial knowledge of their teammates' capabilities and algorithms. We
empirically investigated this challenge with the RoboDragons from Aichi
Prefectural University in Japan at the recent RoboCup 2004 event.
[DarpaTech] We demonstrated our
techniques at the DarpaTech 2004 Symposium.
[US Army War College] We
demonstrated our Segway platforms at the US Army War College during
May, 2004.
[Segway for education]
Our work has led to the use of the Segway RMP as the new robot
education platform for Carnegie Mellon's Qatar initiative.
Technology Transition
Our work has led to technology transitions in a number of situations.
We detail some of these below, and our efforts to promote further
transitions.
[Sony AIBO software] We have released all of our Sony AIBO software.
This has been used by other researchers to provide known opponents for
them to test their algorithms against. Additionally, it has provided
the base for new research teams to develop their software from (e.g.
Georgia Institute of Technology).
[CMRoboBits] Based on our Sony
AIBO software, we have developed a complete course and simplified
software package for teaching robotics to undergraduates and graduates.
The complete CMRoboBits course notes and software are available on-line.
[Segway for education] Our work has led to the use of the
Segway RMP as the new robot education platform for Carnegie Mellon's
Qatar initiative.
[CMVision] Our fast color
vision library has been available for some time now. It is widely used
in the robot soccer research community, and in the wider research
community. It is used within the autonomous foosball project (which is
being commercialized), among others. It also provides the benchmark
standard for comparing similar algorithms in terms of performance and
speed.
Technology transition efforts:
[Software] All of our software is regularly released to the community
and is available on-line.
[Publications] All publications from our group are available on-line
for perusal.
[Multi-media] All movies documenting our research are available
on-line. This promotes interest in our work from abroad leading to
greater exposure with the community.
[Robot hardware] All small-size robot hardware is available on-line.
Relevant Publications
An Evolutionary Approach To Gait Learning For Four-Legged Robots,
Sonia Chernova and Manuela Veloso.
In Proceedings of IROS'04, Sendai, Japan, September 2004.
State Identification From Robot Sensors Using Non-Parametric Statistics,
Scott Lenser and Manuela Veloso.
In Proceedings of IROS'04, Sendai, Japan, September 2004.
Turning Segways into Soccer Robots,
Jeremy Searock, Brett Browning, and Manuela Veloso.
In Proceedings of IROS'04, Sendai, Japan, September 2004.
Skill Acquisition and Use for a Dynamically-Balancing Soccer Robot,
Brett Browning, Ling Xu, and Manuela Veloso.
In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04), San Jose, July 2004.
Advice Generation from Observed Execution: Abstract Markov Decision Process Learning,
Patrick Riley and Manuela Veloso.
In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04), San Jose, July 2004.
Accurate and flexible simulation for dynamic, vision-centric robots,
Jared Go, Brett Browning, and Manuela Veloso.
In Proceedings of The Third International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS04), New York, July 2004.
Learning from accelerometer data on a legged robot,
Douglas Vail and Manuela Veloso.
In Proceedings of the 5th IFAC/EURON Symposium on Intelligent Autonomous Vehicles (IAV2004), Lisbon, Portugal, July 2004.
CommLang: Communication for Coachable Agents,
John Davin, Patrick Riley, and Manuela Veloso.
In Proceedings of the RoboCup International Symposium, Lisbon, Portugal, July 2004.
Segway CM-RMP Robot Soccer Player,
Jeremy Searock, Brett Browning, and Manuela Veloso.
In Proceedings of the RoboCup International Symposium, Lisbon, Portugal, July 2004.
Plays as team plans for coordination and adaptation,
Michael Bowling, Brett Browning, and Manuela Veloso.
In Proceedings of the 14th International Conference on Automated Planning and Scheduling (ICAPS-04), Vancouver, June 2004.
Development of a soccer-playing dynamically-balancing mobile robot,
Brett Browning, Paul Rybski, Jeremy Searock, and Manuela Veloso.
In Proceedings of ICRA-2004, New Orleans, May 2004.
Learning and using models of kicking motions for legged robots,
Sonia Chernova and Manuela Veloso.
In Proceedings of ICRA-2004, New Orleans, May 2004.
CMRoboBits: Creating an Intelligent AIBO Robot,
Manuela Veloso, Scott Lenser, Douglas Vail, Paul E. Rybski, Nick Aiwazian, and Sonia Chernova.
In Proceedings of the AAAI Spring Symposium on Accessible Hands-on Artificial Intelligence and Robotics Education, Stanford, March 2004.
Relevant Video Footage
A number of videos demonstrating the research we have conducted under
the MARS program are available on-line. These videos were taken from
various RoboCup competitions, various demonstrations, as well as within
our laboratories under experimental conditions. We have categorized
these videos based on the specific domain. Unless specifically noted,
all videos show robots operating autonomously:
Manuela Veloso
Last modified: Thu Aug 1 2004