MPI-Video Surveillance and Monitoring:
Our Vision

Information System versus Signal Processing

Our distinct contribution to VSAM research is the articulation of VSAM as an information system. This stands in contrast to image and signal processing systems. A signal processing system typically takes signals from one or more sensors and processes the data into a form that is useful for some purpose. In doing so, it may fuse the data from those sensors, but it deals primarily with data at the sensor level. An information system handles a broad range of information, not just sensor data. Sensor data is still important to us, but we consider it in a much larger context.

Signal Processing Approach

Consider the scene in the following image. It is an outdoor scene in an urban area, but it could be anywhere.

An outdoor urban scene that is the subject of surveillance

There are multiple sensors placed in and around the area for surveillance. In this case there are three video cameras situated on elevated platforms and an airborne infrared sensor in a helicopter, such as those used by police to track suspected felons at night. This leads to the scenario depicted in the image below.

A surveillance system that uses multiple sensors to cover a scene.

While the multiple sensors cover the area effectively, they produce too much information for a single person to assimilate. Fusing data from groups of sensors helps, but the problem worsens as the number of sensors increases, and surveillance tasks that use thousands of sensors are certainly possible.

MPI-Video - An Information System

Our approach, called Multiple-Perspective Interactive Video (MPI-Video), uses the multiple heterogeneous sensors, integrating the information derived from them to form a comprehensive environment model (EM) of the surveilled area (see figure below).

An MPI-video surveillance system that assimilates sensor information into a single environment model.

The EM contains a dynamically evolving description of the real environment at multiple abstraction levels. The particular description depends on the demands of the application. There are two distinct functions performed by the EM:

Information Assimilation

For the purposes of information assimilation, the EM may be viewed as an elaborate state variable that represents the state of the world. Video-understanding algorithms, based on state estimation methods for dynamic systems, perform the assimilation using guidance from other information sources. These other sources are also maintained in the EM. This broad range of information allows the EM to assimilate information in the widest possible context.
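The idea of the EM as a state variable updated by state estimation can be illustrated with a small sketch. The page does not say which estimator MPI-Video uses, so the code below assumes a scalar Kalman-style update as a stand-in. The names `ObjectState`, `predict`, `assimilate`, and `run_demo`, the sensor labels, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ObjectState:
    """Hypothetical one-dimensional slice of an environment model."""
    position: float  # estimated position of an object along one axis
    variance: float  # uncertainty of that estimate


def predict(state: ObjectState, process_noise: float) -> ObjectState:
    # Between observations the world may change, so uncertainty grows.
    return ObjectState(state.position, state.variance + process_noise)


def assimilate(state: ObjectState, measurement: float,
               sensor_variance: float) -> ObjectState:
    # Scalar Kalman update: weight the new measurement by how much we
    # trust the sensor relative to the current estimate.
    gain = state.variance / (state.variance + sensor_variance)
    position = state.position + gain * (measurement - state.position)
    variance = (1.0 - gain) * state.variance
    return ObjectState(position, variance)


def run_demo() -> ObjectState:
    # Assumed observations: (sensor name, measured position, sensor variance).
    observations = [
        ("camera_1", 10.2, 1.0),
        ("camera_2", 9.7, 1.0),
        ("infrared", 11.0, 4.0),
    ]
    state = ObjectState(position=0.0, variance=100.0)  # vague prior
    for _name, measured, sensor_variance in observations:
        state = predict(state, process_noise=0.01)
        state = assimilate(state, measured, sensor_variance)
    return state


if __name__ == "__main__":
    print(run_demo())
```

Each observation, whichever sensor it comes from, refines the same state, so the uncertainty of the estimate ends up lower than that of any single sensor. A real environment model would hold far richer state at several abstraction levels; this sketch shows only the assimilation pattern.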

The Database

The database is the focus of interaction, i.e., a user of the system interacts with the EM by posing queries and modifying data in the database. By interacting with the database, the user may define:

Other interaction allows the user to obtain responses to queries such as:

Interaction with the EM allows a single user to easily manage the abundance of information available from sensors and other sources.
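As a rough illustration of a user posing queries against the EM as a database, the sketch below stores a few object observations in an in-memory SQLite table and asks which objects lie inside a region at a given time. The schema, the sample rows, the function names `build_environment_model` and `objects_in_region`, and the query itself are all assumptions made for this example; the page does not describe the actual database design or query language.

```python
import sqlite3


def build_environment_model() -> sqlite3.Connection:
    """Create a toy, in-memory stand-in for the EM database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects ("
        "object_id INTEGER, label TEXT, t REAL, x REAL, y REAL)"
    )
    # Assumed sample data: two tracked objects observed at two time steps.
    rows = [
        (1, "person", 0.0, 2.0, 3.0),
        (1, "person", 1.0, 2.5, 3.5),
        (2, "vehicle", 0.0, 40.0, 12.0),
        (2, "vehicle", 1.0, 30.0, 12.0),
    ]
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def objects_in_region(conn, t, x_min, x_max, y_min, y_max):
    """Return (object_id, label) for objects inside a rectangle at time t."""
    cursor = conn.execute(
        "SELECT object_id, label FROM objects "
        "WHERE t = ? AND x BETWEEN ? AND ? AND y BETWEEN ? AND ? "
        "ORDER BY object_id",
        (t, x_min, x_max, y_min, y_max),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    em = build_environment_model()
    print(objects_in_region(em, 1.0, 0.0, 35.0, 0.0, 20.0))
```

The point of the design is that the user asks about objects and regions in the scene rather than about individual sensor feeds, which is what lets one person work with many sensors at once.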

We envision an MPI Video Surveillance and Monitoring (MPI-VSAM) system that is a reconfigurable framework, robustly supporting image-understanding capabilities in a variety of environments.


Web page design by Jeffrey E. Boyd