Emotion as part of a Broad Agent Architecture

              W. Scott Reilly and Joseph Bates
                 School of Computer Science
                 Carnegie Mellon University
                     Pittsburgh, PA 15213
                            USA
    joseph.bates@cs.cmu.edu and scott.reilly@cs.cmu.edu

Introduction

The Oz project at CMU is developing technology and art for high quality interactive fiction and virtual worlds. We want to give users the experience of living in dramatically interesting worlds that include moderately competent, emotional agents. We hope Oz will be the basis for one of the first sophisticated AI based art forms.

One of the keys to an artistically engaging experience is for the user to be able to "suspend disbelief". That is, the user must be able to imagine that the world portrayed is real, without being jarred out of this belief by the world's behavior. The automated agents, in particular, must not be blatantly unreal. We feel this is best achieved by producing agents with a broad set of capabilities. For our purposes, these capabilities can be as shallow as necessary, so long as they allow us to build broad, integrated agents.

Many researchers have investigated specific capabilities that autonomous agents should display, like reactive action, or language, or goal-directed behavior. Very few have looked at how to integrate all of these capabilities into one architecture. Our attempt to create such a broad agent architecture is called Tok. Currently, Tok integrates emotion, some social knowledge and behavior, perception, reactivity, inference, and goal-directed behavior.

While Oz worlds are simulated, they must retain sufficient complexity to serve as interesting artistic vehicles. Because of this, Oz agents must deal with imprecise and erroneous perceptions as well as their inability to effectively model the world they inhabit. We suspect that some of our experience with broad agents in Oz may transfer to other domains, such as real-world robots.

The Tok Architecture

The Tok architecture is our attempt to combine a variety of capabilities into a single agent architecture(*). Our current architecture supports reactivity, goal-directed behavior, emotion, social knowledge and behavior, and perception. We are currently working on integrating some natural language understanding and generation as well. We are willing to sacrifice depth in any of these areas for the broad integration of all of them. In this abstract, we will look briefly at the action and emotion aspects of the architecture and how they are integrated in Tok.

ACTION

Hap is Tok's goal-directed, reactive action engine. It continuously chooses the agent's next action based on perception, current goals, emotional state, behavioral features, and other aspects of internal state. Hap is a behavior-based architecture in which behaviors are constructed from subsidiary behaviors and named by goals. Goals do not characterize world states to be accomplished, and Hap does no explicit planning. Instead, sets of actions (which we nonetheless call "plans") are chosen from an unchanging plan library, which may contain one or more plans for each goal. These plans are either ordered or unordered collections of subgoals and actions that can be used to accomplish the invoking goal. Multiple plans can be written for a given goal, distinguished in part by a testable precondition. If a plan fails, Hap will attempt any alternate plans for the given goal, and thus performs a kind of backtracking search in the real world.
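
As a rough illustration of this organization (a sketch only; the identifiers below are hypothetical and are not taken from the Hap sources), a plan library can be modeled in Python as a set of entries indexed by goal name, each carrying a testable precondition, a specificity used during arbitration, and an ordered or unordered collection of steps:

    # Hypothetical sketch of a Hap-style plan library; names and structure
    # are illustrative, not the actual Hap implementation.
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class Plan:
        goal: str                      # name of the goal this plan can serve
        steps: List[str]               # subgoals and primitive actions
        precondition: Callable[[dict], bool] = lambda state: True
        specificity: int = 0           # more specific plans are preferred
        ordered: bool = True           # steps may be ordered or unordered

    # Several plans may serve the same goal; if one fails, Hap tries the
    # alternates, giving a kind of backtracking search in the world.
    plan_library: List[Plan] = [
        Plan(goal="greet-friend",
             steps=["approach-friend", "say-hello"],
             precondition=lambda s: s.get("friend-visible", False),
             specificity=1),
        Plan(goal="greet-friend", steps=["wave"]),
    ]

    def candidate_plans(goal: str, state: dict) -> List[Plan]:
        """Applicable plans for a goal, most specific first."""
        usable = [p for p in plan_library
                  if p.goal == goal and p.precondition(state)]
        return sorted(usable, key=lambda p: -p.specificity)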

Hap stores all active goals and plans in a hierarchical structure called the active plan tree (APT). Various annotations in the APT support reactivity and the management of multiple goals. Two important annotations are context conditions and success tests. Both are arbitrary testable expressions over the perceived state of the world and other aspects of internal state. Success tests may be associated with selected goals in the APT. When a success test is true, its associated goal is deemed to have been accomplished and no longer needs to be pursued. This can happen before the goal is attempted, in which case it is skipped, or during execution of the goal, in which case all of the goal's subsidiary behaviors are terminated.

Similarly, context conditions may be associated with APT plans. When a context condition becomes false, its associated plan is deemed no longer applicable in the current state of the world. That plan fails and is removed from the tree along with any executing subgoals. The parent goal then chooses a new plan or fails.

Hap executes by first modifying the APT based on changes in the world. Goals whose success test is true and plans whose context condition is false are removed, along with any subordinate subgoals or plans. Next, one of the leaf goals is chosen. If the chosen goal is a primitive action, it is executed. Otherwise, the plan library is indexed and the plan arbiter chooses a plan for this goal from among those whose preconditions are true. The plan arbiter will not choose plans that have already failed, and prefers more specific plans over less specific ones. At this point the execution loop repeats.
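
The pruning pass of this cycle can be sketched as follows, assuming a simplified tree of alternating goal and plan nodes (the node layout and names are ours; the real APT carries many more annotations, and re-choosing a plan for a goal whose plan was removed happens later in the cycle):

    # Sketch of APT pruning driven by success tests and context conditions.
    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class APTNode:
        name: str
        kind: str                                          # "goal" or "plan"
        success_test: Callable[[dict], bool] = lambda s: False      # goals
        context_condition: Callable[[dict], bool] = lambda s: True  # plans
        children: List["APTNode"] = field(default_factory=list)

    def prune(node: APTNode, state: dict) -> None:
        """Drop satisfied goals and invalidated plans, with their subtrees."""
        kept = []
        for child in node.children:
            if child.kind == "goal" and child.success_test(state):
                continue          # goal already achieved: skip or terminate it
            if child.kind == "plan" and not child.context_condition(state):
                continue          # plan no longer applicable: it fails
            prune(child, state)
            kept.append(child)
        node.children = kept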

EMOTION

The Em system models selected emotional and social aspects of Tok agents. Emotion generation in Em is based on the ideas of Ortony, Clore, and Collins. Like that work, Em develops emotions from a cognitive base: external events are compared with goals, actions are compared with standards, and objects are compared with attitudes. Currently, Em supports roughly 20 emotion types.

Joy and distress occur when the agent's goals succeed or fail, or become more likely to succeed or fail. Hope and fear occur when Em believes that there is some reasonable chance of a goal succeeding or failing. Fears-confirmed occurs when the fear of a goal failure becomes a reality. Satisfaction occurs when a hoped-for goal succeeds. Disappointment is caused by a hoped-for goal failing, and relief occurs when the fear of a goal failure is dispelled by the goal succeeding. All of these emotions, as well as the ones described below, take a number of other factors into account when determining whether emotions are actually generated and how intense they are if they are generated. For example, two of the factors affecting the generation of joy are the importance of the goal in question and the expectedness of the success.
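
The flavor of these rules can be sketched as follows; the formulae and thresholds are our own placeholders for illustration, not Em's actual appraisal functions:

    # Hypothetical appraisal sketch: intensities grow with the importance of
    # the goal and with how unexpected the outcome was.
    def appraise_goal_outcome(importance, expectedness, succeeded):
        """Joy or distress for a goal that has just succeeded or failed.
        importance and expectedness are assumed to lie in [0, 1]."""
        intensity = importance * (1.0 - expectedness)
        if intensity <= 0.0:
            return {}
        return {"joy": intensity} if succeeded else {"distress": intensity}

    def appraise_prospect(importance, success_likelihood):
        """Hope and fear for a goal whose outcome is still uncertain."""
        return {
            "hope": importance * success_likelihood,
            "fear": importance * (1.0 - success_likelihood),
        }

    # e.g. an important but unexpected success yields strong joy:
    # appraise_goal_outcome(1.0, 0.25, succeeded=True) -> {"joy": 0.75}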

Pride, shame, reproach, and admiration arise when an action is either approved or disapproved. These judgments are made according to the agent's standards, which represent moral beliefs and personal standards of performance. Pride and shame occur when the agent itself performs the action; admiration and reproach develop in response to others' actions.

Some emotions are combinations of other emotions. For example, if one agent performs an action that causes a second agent distress and that the second agent disapproves of, the second agent will feel anger toward the first; anger is thus a composite of distress and reproach. Similarly, gratitude is a composite of happiness and admiration, remorse of sadness and shame, and gratification of happiness and pride.
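
These combinations can be sketched as a small table over the more basic emotion types (using joy and distress for happiness and sadness; the minimum-intensity combination rule below is a placeholder, not Em's actual arithmetic):

    # Hypothetical compound-emotion table and combination rule.
    COMPOUNDS = {
        "anger":         ("distress", "reproach"),   # another agent's blameworthy act
        "gratitude":     ("joy", "admiration"),
        "remorse":       ("distress", "shame"),      # one's own blameworthy act
        "gratification": ("joy", "pride"),
    }

    def compound_emotions(intensities: dict) -> dict:
        """Derive compound emotions whenever both constituents are present."""
        result = {}
        for name, (a, b) in COMPOUNDS.items():
            if intensities.get(a, 0) > 0 and intensities.get(b, 0) > 0:
                result[name] = min(intensities[a], intensities[b])
        return result

    # e.g. compound_emotions({"distress": 0.5, "reproach": 0.3}) -> {"anger": 0.3}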

Finally, love and hate arise from noticing objects toward which the agent has positive or negative attitudes. Our current set of attitudes includes only like and dislike, as in the Ortony, Clore, and Collins work. We are currently working on expanding this set.

Emotions fade with time, but attitudes and standards are fairly stable. An agent will feel love when close to someone liked. This fades if the other agent leaves, but the attitude toward that agent remains relatively stable. When attitudes change, it is usually due to emotional influences. For example, I will gradually learn to dislike agents that frequently make me angry.

This set of emotion generation rules is extensible, which allows for easy addition of new emotions and new ways of generating the current emotions. For example, we plan to add a new emotion called frustration, which would occur when an agent's plan to achieve a goal fails but the goal is still achievable. Also, rules can be added so that agents will feel distress from remembering past distressing events. Besides cognitive appraisals and memories of previous emotional experiences, other ways emotions might arise include: reflexes, daydreaming, social contagion, and body feedback.

Once emotions have been generated, they are stored in a tree structure. The intensities of the emotions in the tree are propagated up the tree in customizable ways. For example, the distress node in the tree will have a number of children, such as homesick and lovesick, whose intensity is combined in a positive manner in the distress node. This means that different types of goal failures can be stored separately, but at the same time combine in such a way as to have an overall distressing effect on the agent.

It is useful to record homesick and lovesick separately so that behaviors specific to the failing goal can be undertaken. For example, when homesick the agent might think about his family; when lovesick, he might daydream about spending time with the missed loved one. Also, as the intensity from these emotions is passed up the tree, general distress behaviors can be triggered as well, such as crying or acting lethargically. We believe it is important to account for both kinds of emotional behavior: behaviors that are specific to the cause of the emotion and behaviors that are general and can be attributed to the type of the emotion.
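
A minimal sketch of such a tree, assuming a simple additive combination rule (Em's combination functions are customizable and need not be additive):

    # Illustrative emotion tree: specific goal failures are stored separately
    # but still contribute to a general distress intensity.
    class EmotionNode:
        def __init__(self, name, children=(), combine=sum):
            self.name = name
            self.children = list(children)
            self.combine = combine       # how child intensities merge upward
            self.own_intensity = 0.0     # intensity generated at this node

        def intensity(self) -> float:
            return self.own_intensity + self.combine(
                c.intensity() for c in self.children)

    homesick = EmotionNode("homesick")
    lovesick = EmotionNode("lovesick")
    distress = EmotionNode("distress", children=[homesick, lovesick])

    homesick.own_intensity = 0.5
    lovesick.own_intensity = 0.75
    print(distress.intensity())   # 1.25: both failures feed overall distress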

One of our design decisions in creating Em was to represent emotions as explicit states in conjunction with emotion mechanisms. This is at odds with some other emotion theories that view emotions as pure mechanisms, interrupts, or perturbances. We decided to treat emotions as structures for purely pragmatic reasons: it makes a number of simulation problems more straightforward. Some of these problems are: creating agents that speak about their emotions, integrating emotions with other Tok subsystems (e.g., language), modeling emotional intensity and decay, testing models of how different emotions combine and influence each other, storing emotional experiences, and getting authors to understand our emotion system. Understanding the system is easier because the view of emotions as states fits better with most folk theories about emotions.

COMPONENT INTEGRATION

There are interesting issues of communication in both directions between the action and emotion parts of Tok. Many of the emotions are driven from the current state of the agent's goals, which are part of the action system. Furthermore, the emotions influence a number of different parts of the action mechanism.

The effects of emotion on action are modulated by behavioral features. These features represent the general way that an agent is acting and can be set by either Hap or Em. An example of a behavioral feature is "aggressive". This feature will often occur when an agent is angry, but it may also arise as an outcome of fear in some agents or it might be part of an agent's plan to pretend to be angry.

Features may influence several aspects of Hap's execution. They may trigger demons that create new top-level goals. They may occur in the preconditions, success tests, and context conditions of plans, and so influence how Hap chooses to achieve its goals. Finally, they may affect the precise style in which an action is performed.
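
A hypothetical sketch of how a single feature such as "aggressive" might thread through these three places (none of the names below come from the Tok sources):

    # Behavioral features as a shared flag set by Em or Hap and consulted
    # throughout the action system.
    state = {"features": set()}

    def em_update(state, emotions):
        """Em (or a Hap behavior) may raise the feature from emotional state."""
        if emotions.get("anger", 0) > 0.5 or emotions.get("fear", 0) > 0.8:
            state["features"].add("aggressive")

    # 1. A demon creating a new top-level goal when the feature appears.
    def aggression_demon(state, top_level_goals):
        if "aggressive" in state["features"]:
            top_level_goals.append("confront-offender")

    # 2. A plan precondition consulting the feature, changing plan choice.
    taunt_precondition = lambda state: "aggressive" in state["features"]

    # 3. An action-style parameter derived from the feature.
    def movement_speed(state):
        return 1.5 if "aggressive" in state["features"] else 1.0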

The feature mechanism, while very ad hoc, appears to provide a useful degree of abstraction in the interface between emotion and behavior. It is not merely a mechanism to vary Tok's behavior and thereby possibly increase the appearance of richness. Rather, it is an initial solution to the integration problem of driving behavior from both goals and emotion.

Through this tight action-emotion integration we are able to make emotion an integral part of the action system. We do not view the architecture as having a rational element and an emotional element fighting to control the action selection mechanism. Rather, we use all the information the agent has from emotions, reasoning processes, and personal preferences to make integrated decisions about which goals and plans to execute. This is somewhat at odds with introspection, which tells us that emotions sometimes take control away from our reason and produce unwanted behaviors or memories. We may have to resolve this conflict between our model and conscious experience at some point in the future.

It should be kept in mind that similar issues are involved in integrating the rest of the Tok architecture that we are not presenting here. For example, to add social knowledge we need to consider such issues as the social constraints and influences on behavior, the effect of goal successes and failures on models of other agents, the social basis for many emotions, and the effects of emotions on relationships with other agents.

From an emotion standpoint, integration is especially important as emotions are only useful to the degree that they affect other systems. For instance, just being afraid isn't very interesting if the agent doesn't act on that fear. In addition to the influence emotions have over the action system, examples of other ways emotions influence an agent include distributing physical resources (e.g., adrenaline rush and muscle tensing), modifying the inferences the agent makes, helping to initiate learning, and modifying social relationships and models of other agents.

The Woggles

Our latest version of the Tok architecture is being used to control "The Woggles", three animated creatures that live in a simulated world and interact with the user (who controls a fourth Woggle) and each other in real time.

Woggles have many different types of goals. In addition to action goals (e.g., play a game), there are goals that do sensing (e.g., notice when two Woggles are fighting), memory (e.g., remember what that Woggle is doing), and emotion processing (e.g., assign blame and get angry when a Woggle causes an important goal to fail), as well as goals that keep track of social relationships and attitudes (e.g., like a Woggle less if it starts a fight).

With our current technology, Woggles are able to process on the order of one hundred goals simultaneously. Which goals are active at any time depends on the initial configuration of the Woggle as well as the current internal and external states. Internal causes of goal activation include being tired or being afraid. External causes of goals include noticing another Woggle asking to play a game or wanting to break up a fight between two other Woggles.

The goals are prioritized based on the context in which they arise. The mind works on as many goals as possible within the given real-time constraints, starting with higher priority goals. If different goals give rise to compatible actions, then the Woggle executes these different actions simultaneously. So, a Woggle can jump and turn and notice a fight all at the same time.
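
One way to sketch this scheduling idea (our illustration, not the actual Woggle scheduler): take goals in priority order and keep each proposed action that is compatible with all actions already chosen:

    # Hypothetical priority-ordered action selection with compatible actions
    # executed simultaneously.
    def choose_actions(goals, compatible):
        """goals: list of (priority, action); compatible: pairwise predicate."""
        chosen = []
        for _, action in sorted(goals, key=lambda g: -g[0]):
            if all(compatible(action, other) for other in chosen):
                chosen.append(action)
        return chosen

    # e.g. a Woggle can jump, turn, and notice a fight at the same time,
    # while two whole-body motions would conflict:
    body = {"jump", "squash"}
    physical = lambda a, b: not (a in body and b in body)
    print(choose_actions(
        [(3, "jump"), (2, "turn"), (1, "notice-fight"), (0, "squash")],
        physical))
    # ['jump', 'turn', 'notice-fight']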

Woggles also use selective perception. Woggles can sense much about their world that isn't useful to them, but it is advantageous to only sense important things, especially as sensing is fairly expensive. This is complicated, however, because what is important changes over time. We handle this by having automatic mechanisms for turning sensors on and off as needed by the active goals and behaviors.
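
The sensor management can be sketched as follows, assuming each behavior declares the sensors it needs (the goal and sensor names are hypothetical):

    # Goal-driven selective perception: only sensors needed by some active
    # goal are left running.
    SENSOR_NEEDS = {
        "play-game":      {"see-woggles", "see-game-props"},
        "break-up-fight": {"see-woggles", "hear-conflict"},
        "sleep":          set(),
    }

    def update_sensors(active_goals, all_sensors):
        """Turn sensors on or off as the set of active goals changes."""
        needed = set()
        for goal in active_goals:
            needed |= SENSOR_NEEDS.get(goal, set())
        return {sensor: (sensor in needed) for sensor in all_sensors}

    print(update_sensors(
        ["play-game"],
        ["see-woggles", "see-game-props", "hear-conflict"]))
    # {'see-woggles': True, 'see-game-props': True, 'hear-conflict': False}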

We believe that the Woggles provide evidence for the feasibility of our research program. They integrate a broad set of capabilities, none of which is modeled very deeply, yet informal evidence from hundreds of users suggests that they are believable, engaging agents.


(*) This is joint work with Bryan Loyall. The other members of the Oz group are Mark Kantrowitz, Phoebe Sengers, and Peter Weyhrauch.