Carnegie Mellon University: Situationally Appropriate Interaction

Situationally Appropriate Interaction

Interruption Study

When seeking to talk to a co-worker, a person can usually tell almost instantly that "now is a bad time". Unfortunately, current computers and communication devices cannot do this. Instead, they act blindly, without regard to the situation they are in. This results in technology that annoys (e.g., cell phones ringing in important meetings) or that is turned off, thus losing much of its potential benefit. To overcome this problem, work done in the Human-Computer Interaction Institute at Carnegie Mellon University has sought ways to observe users with sensors such as cameras and microphones in order to estimate their current interruptibility.

This work has collected audio and video recordings -- similar to what one would see standing at the door to an office -- from a number of users over several weeks. At the same time, these users were periodically prompted for self-reports of their current interruptibility. The recordings were then viewed and coded by a person to simulate a wide range of sensors indicating things such as whether someone is talking, how many people are present, whether the phone is off-hook, and whether the keyboard or mouse is being used. Statistical techniques were then used to determine which of these sensors might be predictive of interruptibility. Using machine learning techniques, several statistical models that can be driven by those sensors have been created.
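As a rough illustration of how coded sensor readings and self-reports could drive a statistical model, the sketch below fits a small Bernoulli naive Bayes classifier. This is an assumption made for illustration only: the text does not say which models the study used, and the sensor names, the binary labels, and the toy data are all hypothetical.

```python
import math

# Hypothetical binary sensors, loosely following the examples in the text.
SENSORS = ["talking", "phone_off_hook", "keyboard_active", "mouse_active"]


def train_naive_bayes(samples, labels):
    """Fit a Bernoulli naive Bayes model with add-one smoothing.

    samples: list of dicts mapping sensor name -> 0 or 1
    labels:  list of 0 (interruptible) or 1 (not interruptible)
    """
    class_counts = {0: 0, 1: 0}
    on_counts = {0: dict.fromkeys(SENSORS, 0), 1: dict.fromkeys(SENSORS, 0)}
    for features, label in zip(samples, labels):
        class_counts[label] += 1
        for name in SENSORS:
            on_counts[label][name] += features[name]

    total = len(labels)
    model = {}
    for label in (0, 1):
        prior = (class_counts[label] + 1) / (total + 2)
        p_on = {
            name: (on_counts[label][name] + 1) / (class_counts[label] + 2)
            for name in SENSORS
        }
        model[label] = (prior, p_on)
    return model


def predict(model, features):
    """Return the label with the highest log-posterior score."""
    scores = {}
    for label, (prior, p_on) in model.items():
        score = math.log(prior)
        for name in SENSORS:
            p = p_on[name]
            score += math.log(p if features[name] else 1.0 - p)
        scores[label] = score
    return max(scores, key=scores.get)


# Toy, made-up observations (not data from the study).
toy_samples = [
    {"talking": 1, "phone_off_hook": 0, "keyboard_active": 0, "mouse_active": 0},
    {"talking": 1, "phone_off_hook": 1, "keyboard_active": 0, "mouse_active": 0},
    {"talking": 0, "phone_off_hook": 0, "keyboard_active": 1, "mouse_active": 1},
    {"talking": 0, "phone_off_hook": 0, "keyboard_active": 0, "mouse_active": 0},
]
toy_labels = [1, 1, 0, 0]

toy_model = train_naive_bayes(toy_samples, toy_labels)
```

Calling `predict(toy_model, {...})` with a new set of sensor readings returns 0 or 1. In a real setting the labels would come from the periodic self-reports, and the model would be evaluated on recordings held out from training.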

The result has been an estimator of human interruptibility which performs significantly better (82.4% vs. 76.9%) than human subjects making judgments from the same audio and video recordings. Further, it has been shown that a very small number of simple sensors -- a microphone, keyboard and mouse activity, and detection of when the phone is off-hook -- can produce estimates as good as those made by humans. Finally, it has been shown that these results can be obtained using real sensors with substantial error rates. In particular, these results have been maintained using a real sensor based on processing the original (poor quality) audio recordings, even though this sensor itself has a 20% error rate.
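One way to probe how robust a model is to an imperfect sensor is to corrupt clean, hand-coded readings at a chosen error rate before feeding them to the model. The sketch below does this for a binary sensor. It assumes that errors are independent, symmetric flips, which the text does not state; the function name and parameters are hypothetical.

```python
import random


def add_sensor_noise(readings, error_rate, rng):
    """Flip each binary reading independently with probability error_rate.

    readings:   list of 0/1 values from a (simulated) sensor
    error_rate: probability in [0, 1] that a single reading is wrong
    rng:        a random.Random instance, passed in for reproducibility
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    return [1 - r if rng.random() < error_rate else r for r in readings]


# Example: corrupt a clean sequence at the 20% rate mentioned in the text.
clean = [1, 0] * 5000
noisy = add_sensor_noise(clean, 0.2, random.Random(0))
flipped_fraction = sum(c != n for c, n in zip(clean, noisy)) / len(clean)
```

Comparing a model's accuracy on `clean` versus `noisy` inputs would then show how much of its performance survives the sensor error.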

This result may lead to systems in a wide range of contexts that begin to exhibit what we might think of as a basic level of politeness -- not simply barging into any situation, but acting appropriately when "now is a bad time". This has the potential to significantly improve how many kinds of information technology interact with people, and to reduce the effects of information overload.
