F.Porikli and Z.Yin, "Temporally Static Region Detection in Multi-Camera Systems," IEEE International Conference on Computer Vision (ICCV) Workshop on Performance Evaluation of Tracking and Surveillance, Rio de Janeiro, Brazil, Oct. 2007 [PDF]


Traditional approaches treat left-behind-object detection as a tracking application and depend heavily on accurate initialization of objects, which is a performance bottleneck. Here, we present a pixel-based solution that employs dual foregrounds of different scene modalities. We construct separate long- and short-term backgrounds modeled as multilayer, multivariate Gaussian distributions. These backgrounds are adapted online using a Bayesian update mechanism at different learning rates, which can be imposed as different frame-processing frequencies. In addition, the formulation for the color background can easily be extended to gradient and feature-point representations. By comparing the current frame with the background models, we construct dual foregrounds. We aggregate evidence scores at each camera to provide temporal consistency on the hypotheses inferred from the foregrounds. We fuse the evidence from multiple cameras on a ground plane with the associated confidence scores to eliminate individual camera failures due to lighting artifacts. Our method does not require object initialization, tracking, or offline training. It accurately segments objects even if they are fully occluded. Its computational load is low, and it readily lends itself to parallelization if further speed improvements are necessary.
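As a rough illustration of the dual-foreground idea, the sketch below maintains two backgrounds at different learning rates and flags a pixel as a temporally-static candidate when it disagrees with the slowly adapting long-term background but matches the quickly adapting short-term one, then accumulates per-pixel evidence over time. It is a simplified single-camera sketch: exponential running averages stand in for the paper's multilayer multivariate Gaussian models, and the threshold, learning rates, and function names are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def update_background(bg, frame, alpha):
    """Exponential running-average update (a simplification of the
    paper's Bayesian multilayer Gaussian background update)."""
    return (1.0 - alpha) * bg + alpha * frame

def static_candidate_mask(frame, bg_long, bg_short, thresh=20.0):
    """A pixel is a static-region candidate when it is foreground with
    respect to the long-term background (the scene changed) but background
    with respect to the short-term one (the change has stopped moving)."""
    fg_long = np.abs(frame - bg_long) > thresh
    fg_short = np.abs(frame - bg_short) > thresh
    return fg_long & ~fg_short

def detect_static_regions(frames, alpha_long=0.01, alpha_short=0.2,
                          thresh=20.0, evidence_max=50):
    """Run the dual-background pipeline over a grayscale sequence and
    aggregate per-pixel evidence for the static-region hypothesis:
    evidence grows where the hypothesis holds and decays elsewhere."""
    bg_long = frames[0].astype(np.float64)
    bg_short = frames[0].astype(np.float64)
    evidence = np.zeros_like(bg_long)
    for frame in frames[1:]:
        f = frame.astype(np.float64)
        mask = static_candidate_mask(f, bg_long, bg_short, thresh)
        evidence = np.where(mask,
                            np.minimum(evidence + 1, evidence_max),
                            np.maximum(evidence - 1, 0))
        bg_long = update_background(bg_long, f, alpha_long)
        bg_short = update_background(bg_short, f, alpha_short)
    return evidence
```

On a synthetic sequence where a bright patch appears and stays put, the short-term background absorbs the patch within a few frames while the long-term background does not, so evidence accumulates only over the patch. In the full method, such per-camera evidence maps would then be projected onto the ground plane and fused across cameras with their confidence scores.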

Demo videos: