Jamieson Schulte, Chuck Rosenberg, Sebastian Thrun
Increasingly sophisticated robots are entering the domain of people and will require new ways of interacting with them. A service robot, for example, should be able to take directions from a non-expert user who has little prior knowledge of the robot's interface. In addition, the robot should be able to interact with people when doing so helps it achieve its goal, such as when moving through a crowded space. Our work addresses the problem of designing interfaces for robots that deal with people spontaneously, and for short periods of time.
The coexistence of humans and robots necessitates new forms of interaction. We expect that people will find robots more useful if they can command them easily and naturally. Robots will perform more sophisticated tasks if they can work cooperatively with people in the environment, rather than treating them as static obstacles. In the past, robots have been confined to specific areas, away from untrained humans, but we are working to place them in the world of people. Beyond direct applications of interactive mobile robots, such as office helpers and tour guides, a robot platform with interactive ability opens interesting new areas for robot learning research.
Much research has already been done on human-robot interaction in which the human is trained in the language recognized by the robot; examples include speech- and gesture-based direction of robots. Additional work has focused on less formal interaction with disembodied software agents. The topic of spontaneous, short-term interaction with mobile robots, however, has received little coverage in the robotics literature to date.
Our approach is to augment existing robot platforms with a motorized ``face'' and speech playback capability. Sensor readings and map information are combined to predict the locations of people in the environment, a prerequisite for interaction. One such robot, called Minerva, served as a tour guide in the crowded Smithsonian National Museum of American History for two weeks. The interaction goals for this robot were to attract people to whom it could give a tour, and to make progress through crowds of people. We employ three aspects of interaction to make the robot a believable agent that can coexist with people. First, a face defines a focal point for interaction and expresses intent and emotional state; it consists of two video-camera eyes with motorized eyebrows, a mouth that can change expression, and an LED display at the mouth that is synchronized with emitted speech. Second, the robot maintains an ``emotional'' state, expressed outwardly through facial expressions and sounds. Third, the robot adapts using a memory-based learner that experiments with new types of interaction and records their effect.
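The text does not specify the internals of the memory-based learner, so the following is only a minimal sketch of one plausible scheme under stated assumptions: experiences are stored as (situation, action, reward) tuples, an action's value in a new situation is estimated from the rewards of its k nearest stored situations, and a small exploration probability drives the experimentation with new interaction types mentioned above. All class, method, and parameter names here are illustrative, not taken from the actual system.

```python
import math
import random

class MemoryBasedLearner:
    """Instance-based learner: remembers (situation, action, reward) tuples
    and picks the action whose nearest stored neighbors earned the best
    average reward, exploring randomly with probability epsilon.
    (Hypothetical sketch; not the actual Minerva implementation.)"""

    def __init__(self, actions, k=3, epsilon=0.1):
        self.actions = list(actions)
        self.k = k                  # neighbors used per estimate
        self.epsilon = epsilon      # exploration probability
        self.memory = []            # list of (situation, action, reward)

    def _estimate(self, situation, action):
        # Average reward of the k nearest past experiences with this action.
        cases = [(math.dist(situation, s), r)
                 for s, a, r in self.memory if a == action]
        if not cases:
            return None             # never tried in any situation
        cases.sort(key=lambda c: c[0])
        nearest = cases[:self.k]
        return sum(r for _, r in nearest) / len(nearest)

    def choose(self, situation):
        # Occasionally experiment with a random interaction type.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        best_action, best_value = None, -float("inf")
        for action in self.actions:
            value = self._estimate(situation, action)
            if value is None:       # prefer actions never tried before
                return action
            if value > best_value:
                best_action, best_value = action, value
        return best_action

    def record(self, situation, action, reward):
        # Record the observed effect of an interaction attempt.
        self.memory.append((situation, action, reward))
```

A situation vector might encode crowd density or distance to the nearest person, and the reward whether people approached or the robot made progress; those features are likewise assumptions for the sake of illustration.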
Most of our work so far has focused on ways in which a robot can influence people in order to achieve specific goals. More work is needed to close the loop and provide novice users with simple, intuitive control over the robot. Our research on learning is in its very early stages and should be revisited with long-term experiments building on our findings with Minerva.