Team-aware Robotic Demining Agents for Military Simulation

Gita Sukthankar and Katia Sycara
Robotics Institute
Carnegie Mellon University
Pittsburgh, PA 15213


In this paper, we describe a robotic demining system developed as part of an multiagent application (AgentStorm) for assisting human commanders in command and control scenarios run in the ModSAF (Modular Semi-Automated Forces) simulation environment. The robotic demining agents cooperatively clear paths, enabling simulated forces to breach minefields. The AgentStorm system is composed of 25+ communicating software components developed with the RETSINA agent architecture; it was successfully demonstrated for military observers in November 1999. Within the demining domain, we explore different multirobot cooperation and communication strategies. Previous work suggests that a homogeneous strategy should perform well in this domain, given its similarity to other foraging-type tasks. However we show that robots using a simple homogeneous approach have difficulty completing the task due to high inter-robot interference. We suggest that the proper way to approach the problem of interference is to make the the robots more ``team-aware''; each robot plans its strategy based on assumptions about what its teammates are doing and current sensor data. This approach works well and substantially outperforms the simple homogeneous approach. Although this type of global optimization can become intractable for large numbers of robots, we present an efficient algorithm that only considers local interactions.


Although real-world solutions for robotic demining have not as yet been demonstrated, multirobot systems will clearly yield significant benefits over a single robot, given the extremely hazardous nature of the domain [JP1999]. Having multiple robots adds a desirable redundancy to the system, as well as potentially reducing the time required to clear an area if the robots are deployed effectively. In this paper, we describe a software implementation of a multirobot demining system that was deployed as part of a larger multiagent system, AgentStorm, to assist human commanders in command and control scenarios. Simulated robotic demining agents coordinate to breach a path through a minefield; demining is modeled as a distributed optimization problem in which the robots strive to minimize an abstract cost function. Although researchers have drawn comparisons between robot demining and foraging [Goldberg & Mataric1997], path breaching cannot be modeled as a simple foraging task because the goal of the exercise is not to optimize on number of mines removed but to assure coverage of a connected path. An overeager forager would seek the section of the minefield with the thickest mine density, and hence the most potential reward, rather than finding paths that avoid mined areas.

Real-World Demining Applications

To date, most of the demining research effort has been applied towards humanitarian demining, the process of systematically clearing unused minefields without the hazard or time pressure of being in a combat zone [Havlík & Licko1998]. Since these minefields are often located near civilian homes or agricultural areas, the usual military expedient of detonating the mines with explosives or plows is undesirable. Another related problem is minefield mapping, which is very similar to many cooperative mapping applications in which the robots must consider tradeoffs between effectively covering an area using redundant scans by multiple robots or spreading out to cover area as quickly possible. In this paper, we explore the problem of minefield breaching, which is often done under combat conditions; the goal of breaching is to clear a path through the minefield large enough to move troops through [DeTeC1997] Usually breaching is performed when speed is of the essence and there is insufficient time to clear the entire field before an upcoming engagement.

Application: AgentStorm

AgentStorm is a multiagent system designed to handle tank platoon movement at a tactical level in military command and control scenarios. Using the AgentStorm system, commanders can plan and execute joint operations in the ModSAF simulation environment [Jones1999]. AgentStorm accepts input through a speech recognition interface agent (incorporating the CMU Sphinx recognizer [Placeway et al. 1997]) that captures salient features from a human mission briefing; each commander involved in the scenario interacts with a mission agent that plans the platoon's movement based on information from the briefing and other supporting agents. Route planning in AgentStorm accounts for factors such as terrain/vehicle interactions, the desirability of having commanders use mutually reinforcing routes, and negative constraints such as minefields. In the event of unforeseen occurrences, the mission agents replan using the most current information. AgentStorm was successfully demonstrated for military observers in November 1999 (CMU ONR MURI Demo). The robotic demining agents and supporting software described in this paper were designed as AgentStorm components but can also function in a stand-alone mode.

Figure: A partial layout of the AgentStorm multiagent system. Tactical tank platoon movement is visualized using the ModSAF simulator; minefield breaching is modeled using TeamBots and additional mine-related sensors, robots, and obstacles developed for AgentStorm. Communication and services are provided by the RETSINA infrastructure. In this example, four agents coordinate to help a human tank commander find a suitable route to reach his rendez-vous point. Commander A's mission agent sends constraints derived from his briefing to the route planning agent (RPA), which calculates the best route based on terrain and vehicle type. The route returned is not suitable for the mission agent's purposes so the mission agent acts to remove the route constraint by asking the Robot Manager to clear the minefield constraining the route and impeding the progress of the tanks.

Software module: Robot Manager

The Robot Manager acts as a bridge between the RETSINA [Sycara1998] software agents and the TeamBots robotic agents handling tasks such as:

  1. deploying the robots;
  2. maintaining a global map of the minefield;
  3. handling monitoring queries from RETSINA agents about the state of the minefield;
  4. handling single shot queries about the state of the minefield and the robots.
The Robot Manager monitors the progress of the demining robot simulation and reports to all subscribers (usually Mission Agents) when a path has been successfully cleared. Although the Robot Manager governs communications outside the TeamBots simulation, the coordination is decentralized and takes place at the individual robot control system level, rather than being handled by a central planner.

Figure: The Robot Manager displays the merged record of all robotic exploration: mined areas are displayed in red; unknown regions in white; safe areas are blue, and defused mines are marked in green. Additionally the Robot Manager relays commands from RETSINA agents to the robots and can act as an intermediary for inter-robot communication.

Problem Description

The demining robots are responsible for clearing a direct vertical path between the top and bottom edges of the minefield. At each time step, robots must select one of three possible actions:

the robot moves towards a target cell, aiming to perform either a scan or a defuse action on arrival. A behavior-based robot controller plans the path, and obstacle avoidance is achieved through a swirling behavior [Arkin & Balch1997].
the (stationary) robot locates mines concealed in the current cell. Scanning is not instantaneous, and the difficulty is abstractly modeled as a time cost. The robot may elect to abort its scan based upon communication with its peers, in which case no information about the current cell is obtained.
the (stationary) robot removes the mines in the current cell. Defusing is typically more time-consuming than scanning. As for scan, an aborted defuse action is assigned no partial credit. Thus, robots must weigh their desire to complete the current task with the goal of helping their peers.

By varying the costs associated with each action, we can model the characteristics of different robotic deminers. For instance, when the cost of scanning is high, strategies that involve a thorough exploration of the minefield become less feasible. Conversely, when defusing is expensive, robots are encouraged to explore their environment before focusing on a common path containing few mines. The risk of accidental mine detonation, leading to robot loss, can also be represented in the simulation but is not further discussed in this paper. We assume that low-level robot sensing and navigational issues are handled robustly and focus on the high-level problems of cooperation, communication, and task allocation.

The robotic demining problem shares characteristics with the consume and graze tasks described in [Balch & Arkin1995]. Robots individually explore unknown areas to identify low-cost paths (paths with fewer cells to be defused), before focusing on the most promising candidates and ``consuming'' the mines.

Implementation Details

The AgentStorm system was built using the RETSINA agent architecture, which provides communication infrastructure and services useful for distributed systems. agents were developed in a number of programming languages (Java, C++, and Perl) and deployed on several operating systems (Windows 95/98, Linux, Solaris, PalmOS). all agents communicate using the KQML [Labrou & Finin1997] ACL and register with the RETSINA agent name server.

Simulation Environment


The ModSAF military simulator is an interactive, high resolution, entity level simulation that represents combined arms tactical operations up to the battalion level; it is commonly used as an aid in military training exercises [STRICOM1999]. ModSAF simulates an extensive list of entities including fixed and rotary wing aircraft, ground vehicles, dismounted infantry, and additional special models such as howitzers, mortars, minefields and environmental effects. simulated entities can behave autonomously, that is they can move, fire, sense, communicate, and react without operator intervention [Jones1999].


For modeling the activity of the demining robots, we chose the TeamBots simulator for robotic agents [Balch & Ram1998]. TeamBots is a Java simulator that has been used successfully in domains such as robotic soccer and foraging tasks. In addition to the simulation environment, TeamBots includes hardware interfaces and classes for driving popular small robots such as the Nomad 150. Using TeamBots one can define simulated sensors and actuators to suit any real robot system. Although the demining robots could have been represented as specially designed entities within the ModSAF simulation, simulating in TeamBots was more effective because: (1) the other agents were using ModSAF with a distance scale appropriate for platoon level maneuvers; (2) the same robot control systems can be tested in the TeamBots simulation and in hardware without any modifications. TeamBots is a kinematically realistic simulator, capable of modeling sensor noise and positional uncertainty if the simulated sensors are defined appropriately.

Robotic demining agents

Demining robots are modeled as a subclass of the Nomad 150 foraging robot in the Java based TeamBots environment. All robots are assumed to have the following capabilities:

positional encoders:
gives robot position in absolute world coordinates;
kin sensor:
detects other robots within a certain range;
obstacle sensor:
detects obstacles within a certain range;
mine sensor:
detects mines within a specified local area.

Sensor error is not explicitly modeled by the simulation environment, and mines do not impede a robot's movement. Currently the only obstacles in the simulated environment are other robots.


Each robot translates its sensor readings into a grid-based coordinate system for mapping the minefield. Cells are marked as being unexplored, safe, mined, or defused, and each robot maintains its own version of this map.


Inter-robot communication is handled through a socket-based communication server (RoboComm) external to the simulation environment. Through the server, robots can broadcast or unicast serialized Java objects to each other. The communication medium is identical to that used by real robots, and failures occasionally occur if the server is accidentally shutdown or buffers overflow. At periodic intervals, robots broadcast their internal maps to the other robots and to the Robot Manager; communicated maps are merged with information derived from personal exploration (internal maps) to form more comprehensive world maps.

Control system

On every time step, robots identify target positions based on their current knowledge about the minefield. Once the robot arrives at its target location, it proceeds to either scan the cells for mines, or to defuse previously marked mines in its cell. Obstacle avoidance during path execution is implemented as a swirling behavior [Arkin & Balch1997].

Software Modules

The AgentStorm system is composed of 25+ communicating software entities, designed to assist the human commanders in playing wargame scenarios in the ModSAF simulation environment. The software modules can be grouped into four basic categories:

RETSINA architecture

The RETSINA [Sycara1998] architecture includes a group of services or ``middle agents'' useful for developers of distributed systems. These resources include: Agent Name Server, Matchmaker, DemoDisplay, Logger, and Launcher. Agent lookups can be performed using the Agent Name Server, which maintains a simple address listing (machine and port) for all agents, and the more sophisticated Matchmaker, which allows agents to perform lookups based on services, rather than name, using the LARKS advertisement language. The DemoDisplay is a 2-D visualization system for dynamically showing interactions between agents, and the logger keeps records of agent activity. The Agent Launcher is a specialized set of startup scripts for launching agents in the correct order, based on agent dependencies prespecified by the programmer in configuration files.

Interface agents

Humans interact with the AgentStorm system through a small group of agents with specialized user interfaces: speech recognition agents and PalmSAF. Commanders use the speech recognition agent to brief their mission agents with instructions for the scenario. The voice recognition agent uses the Sphinx voice recognition system [Placeway et al. 1997] to submit questions to Nacodae [Breslow & Aha1997], the Navy's Case Based Reasoning system. PalmSAF is an interface to the route planning system developed for the PalmPilot(TM) PDA, suitable for use by a foot soldier.

ModSAF related agents

Specialized software modules were developed to allow agents to interact with the ModSAF simulation system. A suite of CGI scripts was created to allow remote manipulation of ModSAF entities and a ModSAF extension was developed for exporting screen shots as ModSAF PostScript files. The screen shots were analyzed by a visual recognition agent which provided the mission agents with current platoon position information.

Specialized Task Agents

Much of the work of AgentStorm is performed by specialized task agents such as the Mission Agent and the Route Planner Agents (RPAs). The Mission Agent uses the RETSINA hierarchical task network planner with some modifications to accommodate team planning, using the Joint Persistent Goal teamwork model [Cohen & Levesque1997]. Routes are planned using the Route Planner Agents, which are capable of providing least cost paths for different cost metrics (e.g., time, fuel) according to specified positive and negative constraints.

Cooperative Multirobot Path Clearing

When designing the multirobot demining system, we considered issues such as:

In this paper, we describe the results of testing two different homogeneous strategies and four communication protocols.

Individual optimization strategy

To efficiently clear the minefield, each robot independently estimates the expected cost of clearing all possible vertical paths and selects the path with the lowest cost. These estimates are based on the robot's current belief about the world, acquired either through sensor readings, or through communication with its peers. Since robots are aware of the relative time cost of all actions, the cheapest path can be expressed as:




To calculate p, the probability of finding a mine in an unknown square, robots are initialized with a prior belief of mine frequency expressed as a pseudocount, that is updated during the task to more accurately reflect the results of exploration:



The system currently assumes that mines are uniformly distributed over the minefield; a single parameter is sufficient to model such a distribution. In many real minefields, where the mines are scattered by truck or plane, the distribution can be better modeled as a mixture of Gaussians. Our model is currently being extended to incorporate locally weighted regression [Atkeson, Moore, & Schaal1997] to better exploit this domain knowledge.

Figure: Using a simple homogeneous strategy, the resulting task partition is inefficient and results in high interference between robots as different robots attempt to demine the same square. Inter-robot interference often prevents the robots from successfully clearing a path. This run was performed with the following simulation parameters: robots=3, mines=100, full broadcast protocol, defuse=1000, scan=10, move=1

Team optimization strategy

For the team optimization strategy, each robot models the decision-making process of its nearest peers. Since robots outside this neighborhood are less likely to interfere the robot's goal, their impact on the choice of next action is not considered. For each teammate within this neighborhood the robot determines whether the teammate is also heading for its chosen column. This calculation can be performed in O(K) time for K neighbors and creates a list of potential interference points. Each robot attempts to find an optimal assignment of peers to the cells in chosen column, based solely upon its beliefs of their internal states. The obvious method for doing this is to exhaustively enumerate all possible permutations of robots and grid cell assignments:


Clearly this strategy scales poorly with the number of column cells. Fortunately the same result can be calculated far more efficiently. A robot need only consider it's K best choices rather than all N cells, because it can never be assigned to a worse option. This enables us to determine the optimal solution while only considering tex2html_wrap_inline349 potential assignments.

Figure: By making the robots ``team aware'', the resulting task partition between robots is more efficient, since robots select non-conflicting target destinations. This run was performed with the following simulation parameters: robots=3, mines=100, full broadcast protocol, defuse=1000, scan=10, move=1

Communication strategies

In the experiments described in this paper, the robots used one of four communication strategies:

In the AgentStorm application, all communication is relayed through the Robot Manager.


Here we present the results of two groups of experiments:

Simulations were initialized with all robots placed at the bottom of the field at the largest possible dispersion to minimize interference. Since the simulation was completely deterministic, only one run of each experimental condition was performed. The following parameters were used:


Preliminary results show no significant time differences between the communication strategies. Using the simple homogeneous strategy, robots are unable to coordinate effectively to focus on a single path. From direct observation, it appears that in the full broadcast mode, robots converge on one path but do not cooperate to defuse mines in parallel. Often while one robot is defusing a mine, other robots are heading towards the same square. In the worst case, the robots interfere with each other and prevent the task from being completed; with no communication, interference is not a problem and the robots always complete the task.

Table: In the first experiment, different communication protocols are compared using three robots and the simple homogeneous strategy. The evaluation metric used is the number of simulated time steps required to clear a path. All results given are for the 3 robot condition.

One possible way to reduce interference is to use the `team-aware' homogeneous strategy, as described earlier. In the second experiment, we compare the performance of both strategies. Unsurprisingly, the `team-aware' strategy yields the greatest benefit in the scenario of maximum interference.

Figure: In the second experiment, the simple homogeneous approach is tested vs. the more computationally intensive `team-aware' strategy with 1-4 robots. The `team-aware' strategy yields the most benefit in the four robot condition, where the simple homogeneous robots suffer from the maximum interference.


Although multirobot systems are often cited as being intrinsically more robust than a single robot by virtue of redundancy, fault tolerance is not a natural byproduct of duplication but only emerges through careful design. Using the simple homogeneous algorithm, multiple robots often dismally fail to clear a path, whereas a single robot consistently succeeds. Unfortunately, a multirobot system cannot always be created by cloning a group of single robots programmed for the same task. There has to be some awareness, either on the part of the robots or the system designer, of the role that other team members will play in completing the task. Unless the global task is somehow partitioned among the robots, they will either interfere with each other or converge on a sub-optimal division of labor.

Heterogeneity, either behavioral or functional, can serve as prior agreement on labor division. Functional heterogeneity usually means that not all the robots are capable of performing all the tasks, whereas behavioral heterogeneity occurs when not all the robots are interested in performing the same section of the task, at least not at the same time. This inhibits SPST (same place, same time) interference, as described in [Goldberg & Mataric1997].

From a system designer's point of view, the simplest form of heterogeneity to introduce into a multirobot system is to have robots include their ID numbers in the decision making process. The use of ID numbers can facilitate the conflict resolution problem immensely and reduce SPST interference since robots can be assigned temporal priorities based on ID numbers (pack strategy) or restricted to separate spatial regions (territories) [Fontan & Mataric1998]. Multirobot systems can also possess a type of pseudo-heterogeneity--due to variance in starting condition (e.g., robot placement) each robot will have different preferences about task division. Temporal communication lags can also create enough robot dissimilarities to prevent clumping and interference (see Figure 3).

Generally, most multirobot systems have implemented one of the following strategies for handling task allocation:

central planner:
a centralized decision maker that assigns tasks to robots.
robots can negotiate to claim tasks. In the simplest protocol, ``first-come, first-serve'', robots can claim tasks with a mutex that prevents other robots from attempting to perform the same task.
due to behavioral or functional dissimilarity, robots have preferences for different sections of the task.
due to situational dissimilarity, robots gravitate towards different tasks.
robots maintain mental models of what their teammates are doing and plan their own strategy accordingly

Homogeneous systems are well suited for using teamwork since it's much easier for a a homogeneous system to ``model'' its fellow teammates. This approach shares some similarities with the central planner method, although it is often computationally faster since each robot only has to consider the actions of a subset of its peers, rather than the actions of the entire team.


In this paper, we have addressed the following issues in cooperative multirobot systems:

This research was integrated into the framework of a large-scale real-world military simulation system.

Our future work goals include:


We thank Rahul Sukthankar, Matt Mullin, and the RETSINA research group.


Arkin & Balch1997
Arkin, R., and Balch, T. 1997. AuRA: Principles and practice in review. Journal of Experimental and Theoretical Artificial Intelligence 9.

Atkeson, Moore, & Schaal1997
Atkeson, C.; Moore, W.; and Schaal, S. 1997. Locally weighted learning. AI Review 11.

Balch & Arkin1995
Balch, T., and Arkin, R. 1995. Communication in reactive multiagent robotic systems. Autonomous Robots 1(1).

Balch & Ram1998
Balch, T., and Ram, A. 1998. Integrating robotics research with JavaBots. In Working notes of the AAAI 1998 Spring Symposium.

Breslow & Aha1997
Breslow, L., and Aha, D. 1997. NaCoDAE: Navy conversational decision aids environment. Technical Report AIC-97-018, Naval Research Laboratory, Navy Center for Applied Reserch in Artificial Intelligence.

Cohen & Levesque1997
Cohen, R., and Levesque, H. 1997. Confirmation and joint action. In Proceedings of the IJCAI.

DeTeC. 1997. Demining robots and vehicles: Military breaching and mechanical mine clearance. <>.

Fontan & Mataric1998
Fontan, M., and Mataric, M. 1998. Territorial multi-robot task division. IEEE Transactions on Robotics and Automation 15(5).

Goldberg & Mataric1997
Goldberg, D., and Mataric, M. 1997. Interference as a tool for designing and evaluating multi-robot controllers. In Proceedings of AAAI.

Havlík & Licko1998
Havlík, S., and Licko, P. 1998. Humanitarian demining: The challenge for robotic research. The Journal of Humanitarian Demining 2(2). <>.

Jones, T. 1999. ModSAF: Modular semi-automated forces. <>.

JP. 1999. Joint doctrine for barriers, obstacles, and mine warfare. <>.

Labrou & Finin1997
Labrou, Y., and Finin, T. 1997. A proposal for a new KQML specification. Technical Report TR CS-97-03, University of Maryland Baltimore County.

Placeway et al. 1997
Placeway, P.; Chen, S.; Eskenazi, M.; Jain, U.; Parikh, V.; Raj, B.; Ravishankar, M.; Rosenfeld, R.; Seymore, K.; Siegler, M.; Stern, R.; and Thayer, E. 1997. The 1996 HUB-4 SPHINX-3 system. In Proceedings of the ARPA Speech Recognition Workshop.

STRICOM. 1999. Directorate for research and engineering. <>.

Sycara, K. 1998. Multiagent systems. AAAI AI Magazine 19(2).

About this document ...

Team-aware Robotic Demining Agents for Military Simulation

This document was generated using the LaTeX2HTML translator Version 96.1 (Feb 5, 1996) Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.

The command line arguments were:
latex2html -split 0 -no_reuse iaai00.tex.

The translation was initiated by Gita Sukthankar on Wed Feb 2 00:01:21 EST 2000

Gita Sukthankar
Wed Feb 2 00:01:21 EST 2000