Stable vs. evolving agents



Another important characteristic to consider when designing multiagent systems is whether the agents are stable or evolving. Evolving agents can be useful in dynamic environments, but allowing agents to evolve can lead to complications, particularly when the agents are competitive. Systems that evolve competitive agents are said to use competitive co-evolution; systems that evolve benevolent agents are said to use cooperative co-evolution. The evolution of both predator and prey agents by Haynes and Sen [37] qualifies as competitive co-evolution.

Grefenstette and Daley conduct a preliminary study of competitive and cooperative co-evolution in a domain that is loosely related to the pursuit domain [32]. Their domain has two robots that can move continuously and one morsel of (stationary) food that appears randomly in the world. In the cooperative task, both robots must be at the food in order to "capture" it. Since the robots can run out of energy if they move too much, they learn to move towards food only when both of them are near enough to reach it. Evolving populations of decision rules using Genetic Algorithms (GAs), Grefenstette and Daley consider different methods of fitness evaluation. Fitness evaluation, the evaluation of the relative "fitness" of individuals in a population so that the most fit can be retained and recombined, is an important component of evolutionary learning techniques. Grefenstette and Daley find that an effective method for cooperative co-evolution in their domain is to use separate GAs to evolve rules for the two agents, evaluating each individual against a "champion" (the individual with the highest fitness) from a random generation of the other GA.
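The champion-based evaluation described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Grefenstette and Daley's implementation: the names `evaluate_with_champion`, `best_individual`, and `joint_score` are hypothetical, the joint task is abstracted into a caller-supplied scoring function, and one partner champion is drawn per call rather than per individual. Selection and recombination within each GA are omitted.

```python
import random
from typing import Callable, List, Sequence, TypeVar

Rule = TypeVar("Rule")


def evaluate_with_champion(
    population: Sequence[Rule],
    partner_champions: Sequence[Rule],
    joint_score: Callable[[Rule, Rule], float],
    rng: random.Random,
) -> List[float]:
    """Score each individual when paired with a champion of the other GA.

    partner_champions holds one champion per past generation of the other
    GA; a single champion is drawn at random from it (an assumption about
    how "a random generation" is sampled).
    """
    partner = rng.choice(partner_champions)
    return [joint_score(individual, partner) for individual in population]


def best_individual(population: Sequence[Rule], fitnesses: Sequence[float]) -> Rule:
    """Return the champion: the individual with the highest fitness."""
    best_index = max(range(len(population)), key=lambda i: fitnesses[i])
    return population[best_index]
```

In a full loop, each GA would append its current champion to its own history after every generation, and the other GA would sample from that history when evaluating its individuals.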

In a competitive task in the same domain, agents try to be the first to reach the food [32]. Again, different GA evaluation methods are considered for use in evolving rule sets to control the agents.

One problem to contend with in competitive rather than cooperative co-evolution is the possibility of an escalating "arms race" with no end. Competing agents might continually adapt to each other in more and more specialized ways, never stabilizing at a good behavior. Of course, in a dynamic environment it may not be feasible or even desirable to evolve a stable behavior. Applying reinforcement learning (RL) to the iterated prisoner's dilemma, Sandholm and Crites find that a learning agent is able to perform optimally against a fixed opponent [75]. But when both agents are learning, there is no stable solution.

Another issue in competitive co-evolution is the credit/blame assignment problem. When the performance of an agent improves, it is not necessarily clear whether the improvement is due to a change for the better in that agent's behavior or a change for the worse in the opponent's behavior. Similarly, if an agent's performance gets worse, the cause could be a decline in that agent's behavior or an improvement in the opponent's.

One way to deal with the credit/blame problem is to fix one agent while evolving the other and then switch. Of course this method encourages the arms race more than ever. Nevertheless, Rosin and Belew use this technique, along with an interesting method for maintaining diversity in genetic populations, to evolve agents that can play Tic-Tac-Toe, Nim, and a simple version of Go [66]. When it is a given agent's turn to evolve, it executes a standard GA generation. Individuals are tested against individuals from the competing population, but a technique called "competitive fitness sharing" is used to maintain diversity. When using this technique, individuals from agent X's population are given more credit for beating opponents (individuals from agent Y's population) that are beaten by few other individuals from agent X's population. More specifically, the reward to an individual for beating individual y is divided by the number of individuals in agent X's population that beat y, the individual itself included. Competitive fitness sharing shows much promise for people building systems that use competitive co-evolution.
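The reward-sharing rule can be made concrete with a short sketch. It is a minimal illustration, not Rosin and Belew's code: the function name `competitive_fitness_sharing` and the `wins` data layout are hypothetical, each win is assumed to be worth one unit before sharing, and the divisor counts every individual in X's population that beats y, including the one being scored, so it is never zero.

```python
from typing import Dict, Hashable, Set


def competitive_fitness_sharing(
    wins: Dict[Hashable, Set[Hashable]],
) -> Dict[Hashable, float]:
    """Compute shared fitness for each individual in agent X's population.

    wins maps each X-individual to the set of Y-individuals it beat.
    """
    # Count how many X-individuals beat each opponent y.
    beaten_by: Dict[Hashable, int] = {}
    for opponents in wins.values():
        for y in opponents:
            beaten_by[y] = beaten_by.get(y, 0) + 1

    # Each win over y is worth 1 / (number of X-individuals that beat y).
    return {
        x: sum(1.0 / beaten_by[y] for y in opponents)
        for x, opponents in wins.items()
    }


# Hypothetical results: y1 is beaten by everyone, y2 and y3 by one individual each.
example_wins = {"x1": {"y1", "y2"}, "x2": {"y1"}, "x3": {"y1", "y3"}}
shared = competitive_fitness_sharing(example_wins)
```

In this example x2 earns only a third of a point for beating the commonly beaten y1, while x1 and x3 each earn a full extra point for being the sole individual to beat y2 and y3 respectively, which is how the scheme rewards diversity.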


Peter Stone
Wed Sep 24 11:54:14 EDT 1997