Improving Spatial Support for Objects via Multiple Segmentations



Sliding window scanning is the dominant paradigm in object recognition research today. But while much success has been reported in detecting several rectangular-shaped object classes (e.g. faces, cars, pedestrians), results have been much less impressive for more general types of objects. Several researchers have advocated the use of image segmentation as a way to get better spatial support for objects. In this paper, our aim is to address this issue by studying the following two questions: 1) how important is good spatial support for recognition? 2) can segmentation provide better spatial support for objects? To answer the first, we compare recognition performance using ground-truth segmentation vs. bounding boxes. To answer the second, we use the multiple segmentation approach to evaluate how closely real segments can approach the ground truth for real objects, and at what cost. Our results demonstrate the importance of finding the right spatial support for objects, and the feasibility of doing so without excessive computational burden.

Presentation Slides

BMVC 2007 Slides


Tomasz Malisiewicz, Alexei A. Efros. Improving Spatial Support for Objects via Multiple Segmentations, British Machine Vision Conference (BMVC 2007), September 2007. PDF [BibTeX]


You can find some optimized MATLAB code for merging segments up to size N.
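
The linked code is not reproduced here. As a rough illustration of what "merging segments up to size N" can mean, the sketch below enumerates every connected group of at most N adjacent segments from a segment adjacency matrix. The matrix A, the chain layout, and all variable names are assumptions made for illustration; this is not the optimized implementation referred to above.

```matlab
% Hypothetical adjacency matrix for 4 segments arranged in a chain 1-2-3-4.
A = logical([0 1 0 0;
             1 0 1 0;
             0 1 0 1;
             0 0 1 0]);
N = 3;                       % maximum number of segments per merge

n = size(A, 1);
current = num2cell(1:n);     % merges of size 1 (the segments themselves)
merges = current;            % accumulates merges of every size up to N
for k = 2:N
    next = {};
    keys = {};
    for i = 1:numel(current)
        s = current{i};
        % Segments adjacent to any member of s, excluding s itself.
        nbrs = reshape(setdiff(find(any(A(s, :), 1)), s), 1, []);
        for j = nbrs
            t = sort([s j]);
            key = sprintf('%d,', t);
            if ~any(strcmp(keys, key))   % skip groups already found
                keys{end+1} = key;
                next{end+1} = t;
            end
        end
    end
    merges = [merges next];
    current = next;
end
```

Each entry of merges is a sorted vector of segment indices whose union forms one candidate region. Growing groups one neighbor at a time keeps every candidate connected, at the cost of a duplicate check for each grown group.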

Additional Results

Many more resulting images can be viewed on the additional results page.

Cleaned up MSRC2 Segmentations

The cleaned up MSRC2 segmentations can be downloaded here.
These are in MATLAB format.  The MSRC labelings contain 'void'
regions, because some parts of an image do not fit into any of the 23
classes.  If unique(newseg(:)) has one more element than newlabels,
then this extra element is the 'void' region.

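% Append -1 as the label for the extra 'void' region index.
labels_with_void = [newlabels -1];
% Map each region index in newseg to its label (-1 marks void).
segim = labels_with_void(newseg);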

Everything isn't stored in a single matrix because I've split
disconnected instances of the same category into different regions.
Keeping instances separate is a reasonable choice for evaluating
segmentation algorithms.
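
As a small illustration of the mapping above, here is a toy sketch. The 3x4 newseg and the three-entry newlabels are made-up values rather than data from the download, and newlabels is assumed to be a row vector holding one class id per region.

```matlab
% Toy data (hypothetical values, not from the MSRC2 files).
% Regions 1 and 3 are two disconnected instances of the same class,
% and index 4 has no entry in newlabels, so it is the 'void' region.
newseg = [1 1 2 2;
          1 4 2 3;
          4 4 3 3];
newlabels = [7 2 7];   % assumed: row vector, one class id per region

% True when newseg contains one more index than newlabels has entries.
has_void = numel(unique(newseg(:))) == numel(newlabels) + 1;

% Map region indices to class labels, with -1 standing for void.
labels_with_void = [newlabels -1];
segim = labels_with_void(newseg);

% Logical mask of void pixels, e.g. for excluding them from evaluation.
void_mask = (segim == -1);
```

In this toy case newseg still tells the two instances of class 7 apart (regions 1 and 3), while segim collapses them into a single class label.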


This research is supported by: