Paper:

Z. Yin and R. Collins, "Online Figure-Ground Segmentation with Edge Pixel Classification," 19th British Machine Vision Conference (BMVC), September 2008. [PDF]

Abstract:

The need for figure-ground segmentation in video arises in many vision problems, such as tracker initialization, accurate object shape representation, and drift-free appearance model adaptation. This paper uses a 3D spatio-temporal Conditional Random Field (CRF) to combine different segmentation cues while enforcing temporal coherence. Without supervised parameter training, the weighting factors for the different data potential functions in the CRF model are adapted online to reflect changes in object appearance and environment. To obtain an accurate boundary from the 3D CRF segmentation result, edge pixels are classified into three classes: foreground, background, and boundary. The final foreground region bitmask is constructed from the foreground and boundary edge pixels. The effectiveness of our approach is demonstrated on several airborne videos in which objects undergo large appearance changes and heavy occlusion.
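
The idea of weighting several data potentials, as described in the abstract, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (combine_cues, label_pixels), the use of per-cue foreground probability maps, and the negative log-likelihood form of the energies are all assumptions made for illustration. The sketch also omits the spatial and temporal pairwise terms of the 3D CRF and the online weight adaptation, so each pixel is labeled independently.

```python
import numpy as np

def combine_cues(cue_probs, weights, eps=1e-6):
    """Combine per-cue foreground probabilities into weighted unary energies.

    cue_probs: array of shape (num_cues, H, W), each entry a hypothetical
               per-pixel foreground probability from one segmentation cue.
    weights:   one non-negative weight per cue (assumed to be given here;
               the paper adapts its weights online).
    Returns (fg_energy, bg_energy), each of shape (H, W).
    """
    probs = np.clip(np.asarray(cue_probs, dtype=float), eps, 1.0 - eps)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1
    # Weighted sum over the cue axis of the negative log-likelihoods.
    fg_energy = np.tensordot(w, -np.log(probs), axes=1)
    bg_energy = np.tensordot(w, -np.log(1.0 - probs), axes=1)
    return fg_energy, bg_energy

def label_pixels(cue_probs, weights):
    """Label a pixel foreground where its foreground energy is lower.

    This uses the data terms only; a full CRF would add pairwise
    smoothness terms and solve for all labels jointly.
    """
    fg_energy, bg_energy = combine_cues(cue_probs, weights)
    return fg_energy < bg_energy

# Two hypothetical cues on a 1x2 image.
cues = np.array([
    [[0.9, 0.2]],  # cue 1: foreground probability per pixel
    [[0.6, 0.4]],  # cue 2: foreground probability per pixel
])
mask = label_pixels(cues, weights=[1.0, 1.0])
```

Raising the weight of one cue makes its probability map dominate the labeling, which is the lever an online adaptation scheme would adjust as object appearance and environment change.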

Results (click images for videos):