Project Objectives
Develop complete, effective and scalable software for autonomous robot teams. Demonstrate robot teams with integrated perception, reasoning, learning, communication and cooperative strategies that solve complex multi-agent tasks.
Project Website
Approach
Under the MARS program, we continue to focus our efforts on meeting the key challenges for building individually skilled autonomous robots and successful multi-robot teams to operate in uncertain, adversarial domains. These challenges range from single robot issues of localization, real-time visual sensing, robust tracking, high-speed navigation control, and automatic segmentation and recognition of environmental regimes through to multi-robot team issues of adaptive team strategy, robust cooperation through communication in the face of uncertainty, and team learning. All of these issues are important to the successful operation of multi-robot teams.
Over the last year, we have competed at the RoboCup-2002 world robot soccer championships with great success by winning first prize in the legged league competition and reaching the quarter finals in the small-size league. We then demonstrated our research during the DarpaTech Symposium held Los Angeles. Since then, our work has focused primarily on consolidating, documenting, and extending our research efforts. As a result, much of our work has now been published, or is in the process of being published, a vast majority of our software is now on-line and publicly accessable to the research community through our web pages. In particular, our recent efforts have focused on:
Multi-robot Learning
In our previous work (see http://www.cs.cmu.edu/~multirobotlab/MARS)
we have developed empirically and rationally convergent multi-agent
learning techniques based on the theory of stochastic games, Markov
Decision Processes, and reinforcement learning. We have previously
reported on the WoLF algorithm (short for Win or Learn Fast) that we
have successfully applied to stochastic games with very large,
although still discrete, state spaces. Our recent work and on-going
work looks at a significantly more challenging problem: applying these
multi-agent learning concepts to real robot systems with continuous
state and action spaces. We are happy to report that we have
succesfully applied WoLF to an adversarial zero-sum multi-robot
learning task we call "Shutout". The task requires an attacker robot
to attempt drive to the center of a circle while the defender robot
attempts to screen it away (collision avoidance is used). Using our
WoLF algorithm in combination with on-policy SARSA using tile-coding
function approximation techniqes, our system is able to learn in both
simulation and on the real robots within a reasonable time frame of a
few hundred trials useful strategies for attacking and defending,
respectively. (paper forthcoming).
Adaptive Team Strategy
Our work with multi-agent
systems demonstrates that in adversarial tasks at least, static team
strategies can invariably be dominated by ones that adapt over time.
To be most effective, team strategy must meld the individual skills
available to each member of a team to produce a strategy that best
responds to the actions of the opponent. During the last year, we have
developed a strategy engine that adapts team behavior in real-time to
the actions of the team without requiring knowledge, either apriori or
otherwise, of the opponent's strategy.
Our strategy system is built around the notion of a play, which encodes a sequenced team plan. A play encodes a sequence of behaviors for each robot to perform along with applicability, termination, and dynamic role assigmnet conditions. Applicability conditions enable specialized plays, ie. ones that only apply in very special circumstances, which effectively simplify the range of the joint state space that the resulting behavior sequence needs to be defined over. Role assignment is performed on a prioritized basis using behavior-specific evaluation methods. Thus, the system takes advantage of the best robot to use for a given role rather than using some static pre-ordained assignment. Termination conditions, which include timeouts, ensure plays have well defined criteria for stopping and transitioning to a another play and are useful for adapting play selection. Plays are written in a text-based, easy to understand language enabling easy extension or construction of different play scenarios.
The most interesting innovation, has been the use of adaptation to automatically adjust play selection to focus on the dominant team strategies. Concretely, applicable plays in a given situation are selected at random using an assigned weight which defines the probability of selecting the play. Our algorithm adjusts this weight over time based on the success, or failure, of each play (using the termination conditions mentioned previously). We empirically validated this approach at RoboCup 2002 successfully, and have recently explored the capabilities of this approach using more controlled, in-lab experiments (see the recent publications, and in submission).
Teamwork communication and collaboration
Since the recent
availability of wavelan to support wireless communication for our Sony
AIBOs, we have been investigating algorithms and representations for
collaborative behavior between autonomous legged robots. This work
represents a unification and extension of our prior work with the Sony
AIBO's and our minnow team for distributed cooperation via
communication. (paper under submission).
We have developed algorithms for building a global world model that is jointly maintained across the team. Each robot shares with the team their own high-level local perception output along with the associated measures of uncertainty. Building on our prior work in distributed sensing, our algorithms effectively fuse the information reliably while handling uncertainty and conflict resolution in a principled manner. The resulting global map enables us to estimate the confidence of our model, to predict future states of the world, and extend each robot's effective perception range beyond the capabilities of any single robot. (paper under submission).
Based on the fused global model, we have build team coordination algorithms. Our robots position themselves by a gradient ascent function in a field of constraints and objectives; they spread out on the field assuming roles of attacker, supporter, and defender in positions that are suitable to carry out tasks in line with that role. (paper under submission).
Localization
In our previous work under the MARS
program, we developed Sensor Resetting Localization (SRL), an
extension to Monte-Carlo probablistic localization techniques. Because
effective robots may be small, they can be moved by external sources
beyond the robot's own control. The SRL algorithm that we devised
allows robots to use their sensor input to rapidly adapt to changes in
their position. The robots effectively localize in real-time with an
accuracy of 2-5 cm. Although effective, SRL still requires heavy
computational resources given the position belief computation for its
large number of sample points. (see publications).
Continuing our commitment to effective real-time localization, we have investigated new localization algorithms that attempt to address the multi-sample problem. We developed a localization algorithm, Constraint Based Localization (CBL), which maintains a single belief of the robot's location updated by its sensor input and movement. CBL is computationally more efficient than SRL and is more robust through its use of multiple landmarks. Our latest algorithms have been implemented on different robot platforms. (Various publications, see list below)
High-speed navigation and control
High-speed
navigation is a poorly explored area of robotics research and yet
there are many robotics problems that require high-speed navigation
for them to be effective. To date, it is a commonly accepted belief
that high-speed navigation can only be addressed by a combination of
low-level reactive navigation algorithms driven by high-level planning
algorithms. Our recent work has proven this belief to be incorrect.
We have extended the Rapidly-Exploring Random Trees (RRT) planning algorithm and applied it to controlling our team of five small-size robots (average computation time is 2ms per robot) moving at speeds of up to 1.7m/s through cluttered, moving obstacles fields. The algorithm works by progressively building a search tree through a combination of random exploration and biased motion towards the goal state. Our extensions to the basic RRT algorithm include:
Preliminary comparisons against conventional reactive navigation schemes indicate that it can control the robot at significantly higher speeds without requiring significant computational resources. (IROS'02 publication, see list below)
Generalized vision-based obstacle avoidance
We have
developed a new vision processing algorithm, built upon our CMVision
library (see http://www.cs.cmu.edu/~jbruce/cmvision),
to perform general (ie. any shape or color) obstacle avoidance. The
algorithm operates on our Sony AIBO's at the full frame-rate of 25Hz
on a 200MHz MIPs processor. The algorithm, which we call visual
sonar, calculates the radial distances to objects around the
robot regardless of the robot's head or body orientation. (paper
forthcoming)
Real-time visual sensing and tracking
In order for our
robots to react sensibly to changes in the real world, they must be
able to quickly sense objects in their environment. We have extended
our previously developed color vision library, called CMVision (see http://www.cs.cmu.edu/~jbruce/cmvision
for details) to process images at the full available frame rate with
low CPU usage. Moreover, CMVision is cross platform and operates on
our Sony AIBO's as well. Additionally, we have developed new
techniques to improve the robustness of multi-agent tracking in the
face of false-positive recognitions. This technique has applications
in systems where false-positives must be dealt with using a minimal
amount of computational resources. (ICRA'02 publication, see list
below)
Automatic segmentation and recognition of environmental
regimes
Robot environments are typically quasi-stationary
systems. Being able to identify and recognize the different
environmental regimes or contexts offers the ability to extend the
use and dynamic range of many current algorithms significantly. We
are developing algorithms to automatically segment and recognize
different sensory inputs using on-line, non-parametric statistical
tests. Our current work focuses on learning to distinguish lighting
level regimes in an office environment on a Sony Aibo. Once the
lighting level is recognized, the appropriate thresholds for color
image segmentation can be chosen thereby increasing the dynamic range
of the color vision routines used by the robot. (paper under submission).
Our specific accomplishments during the last reporting period are:
Plan
Technology Transition
The following robot designs and software funded under the MARS program have been developed by us and released to the community:
Relevant Publications
Brett Browning and Erick Tryzelaar. UberSim: A Realistic Simulation Engine for Robot Soccer. In Proceedings of Autonomous Agents and Multi-Agent Systems, AAMAS'03, 2003.
James Bruce and Manuela Veloso. Fast and Accurate Vision-Based Pattern Detection and Identification. In Proceedings of ICRA'03, the 2003 IEEE International Conference on Robotics and Automation, Taiwan, May 2003. under submission.
Paul Carpenter, Patrick Riley, Manuela Veloso, and Gal Kaminka. Integration of Advice in an Action-Selection Architecture. In Gal A. Kaminka, Pedro U. Lima, and Raul Rojas, editors, RoboCup-2002: The Fifth RoboCup Competitions and Conferences, Springer Verlag, Berlin, 2003. (to appear).
Scott Lenser and Manuela Veloso. Automatic Detection and Response to Environmental Change. In Proceedings of ICRA'03, the 2003 IEEE International Conference on Robotics and Automation, Taiwan, May 2003. under submission.
Patrick Riley. MPADES: Middleware for Parallel Agent Discrete Event Simulation. In Gal A. Kaminka, Pedro U. Lima, and Raul Rojas, editors, RoboCup-2002: The Fifth RoboCup Competitions and Conferences, Springer Verlag, Berlin, 2003. (to appear) Won a RoboCup Engineering Award.
Maayan Roth, Douglas Vail, and Manuela Veloso. A World Model for Multi-Robot Teams with Communication. In Proceedings of ICRA'03, the 2003 IEEE International Conference on Robotics and Automation, Taiwan, May 2003. under submission.
Douglas Vail and Manuela Veloso. Multi-Robot Dynamic Role Assignment and Coordination Through Shared Potential Fields. In Proceedings of ICRA'03, the 2003 IEEE International Conference on Robotics and Automation, Taiwan, May 2003. under submission.
James Bruce, Michael Bowling, Brett Browning, and Manuela Veloso. Multi-Robot Team Response to a Multi-Robot Opponent Team. In Proceedings of IROS'2002 International Conference on Robotics and Autonomous Systems, Workshop on Collaborative Robots, Switzerland, 2002.
Multiagent learning using a variable learning rate, Michael Bowling and Manuela Veloso. Artificial Intelligence 136:215-250, 2002.
Real-time randomized path planning for robot navigation, James Bruce and Manuela Veloso. In Proceedings of IROS-2002, Switzerland, October 2002. An earlier version of this paper appears in the Proceedings of the RoboCup-2002 Symposium.
Scalable learning in stochastic games, Michael Bowling and Manuela Veloso. In Proceedings of the AAAI-2002 Workshop on Game Theoretic and Decision Theoretic Agents, Edmonton, Canada, August 2002.
Improbability Filtering for Rejecting False Positives, Brett Browning, Michael Bowling, and Manuela Veloso. In Proceedings of ICRA-02, the 2002 IEEE International Conference on Robotics and Automation, Washington, DC, May 2002.
Fast Parametric Transitions for Smooth Quadrupedal Motion, James Bruce, Scott Lenser, and Manuela Veloso. In A. Birk, S. Coradeschi, and S. Tadokoro, editors, RoboCup-2001: The Fifth RoboCup Competitions and Conferences. Springer Verlag, Berlin, 2002.
A modular hierarchical behavior-based architecture, Scott Lenser, James Bruce, and Manuela Veloso. In A. Birk, S. Coradeschi, and S. Tadokoro, editors, RoboCup-2001: The Fifth RoboCup Competitions and Conferences. Springer Verlag, Berlin, 2002.
CM-Pack'01: Fast Legged Robot Walking, Robust Localization, and Team Behaviors William Uther, Scott Lenser, James Bruce, Martin Hock, and Manuela Veloso. In A. Birk, S. Coradeschi, and S. Tadokoro, editors, RoboCup-2001: The Fifth RoboCup Competitions and Conferences. Springer Verlag, Berlin, 2002.
Convergence of gradient dynamics with a variable learning rate, Michael Bowling and Manuela Veloso. In Proceedings of ICML-2001, pages 27--34, Williams College, MA, June 2001.