Incorporating Background Invariance into Feature-Based Object Recognition

Andrew Stein (Home Page) & Martial Hebert

Workshop on Applications of Computer Vision (WACV)
January 2005

Abstract: Current feature-based object recognition methods use information derived from local image patches. For robustness, features are engineered for invariance to various transformations, such as rotation, scaling, or affine warping. When patches overlap object boundaries, however, errors in both detection and matching will almost certainly occur due to inclusion of unwanted background pixels. This is common in real images, which often contain significant background clutter, objects which are not heavily textured, or objects which occupy a relatively small portion of the image. We suggest improvements to the popular Scale Invariant Feature Transform (SIFT) which incorporate local object boundary information. The resulting feature detection and descriptor creation processes are invariant to changes in background. We call this method the Background and Scale Invariant Feature Transform (BSIFT). We demonstrate BSIFT's superior performance in feature detection and matching on synthetic and natural images.
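The core idea of excluding background pixels from the descriptor can be illustrated with a small sketch. This is not the paper's BSIFT implementation; it is a simplified, hypothetical example showing how a gradient-orientation histogram (the building block of SIFT-style descriptors) can be accumulated over foreground pixels only, so that changing the background leaves the descriptor unchanged. The function name and patch layout are illustrative assumptions.

```python
import numpy as np

def masked_orientation_histogram(patch, mask, n_bins=8):
    """Gradient-orientation histogram over a patch, accumulating only
    foreground pixels (mask == True).

    Illustrative sketch of background-invariant description: pixels
    outside the object boundary are ignored, so the histogram does not
    change when the background changes. Not the paper's actual BSIFT.
    """
    # Image gradients via central differences.
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx)  # orientation in [-pi, pi]

    # Zero out contributions from background pixels.
    mag = mag * mask

    # Quantize orientations into n_bins and accumulate magnitudes.
    bins = ((ori + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())

    # Normalize, as SIFT does for illumination invariance.
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# Toy example: a 16x16 patch whose left half is "object" and right half
# is "background". The mask covers columns 0-6 only, staying one pixel
# inside the object so no masked gradient touches a background pixel.
rng = np.random.default_rng(0)
fg = rng.random((16, 8))
patch_a = np.hstack([fg, rng.random((16, 8))])  # background A
patch_b = np.hstack([fg, rng.random((16, 8))])  # background B
mask = np.zeros((16, 16), dtype=bool)
mask[:, :7] = True

h_a = masked_orientation_histogram(patch_a, mask)
h_b = masked_orientation_histogram(patch_b, mask)
# h_a and h_b are identical despite the different backgrounds, whereas
# an unmasked histogram over the full patch would differ.
```

With the mask applied, the two descriptors match exactly; without it, the random background halves pull the histograms apart, which is precisely the matching failure mode the abstract describes for boundary-overlapping patches.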

The full paper (8 pages) is available in Postscript (~1.5MB) or PDF format (471KB). The PowerPoint slides from my presentation at WACV are also available. (Note that this is a 2.5MB, read-only, PowerPoint 2003 file.)

You may also download our database of 110 objects (392KB), their corresponding foreground/background segmentations (42KB), and 25 indoor/outdoor background images (~4MB), which were used for the synthetic results described in the paper. All files are in PNG format. The databases combine some of our own images with other image/object databases found online (see the paper for references). Here are some example objects:

The following figures demonstrate the benefits of incorporating object boundary information (in this case, hand-labelled boundaries) into both the detection and description phases of feature-based recognition. Each depicts an image of a Sony Aibo(tm) on two different background textures. (These figures are taken directly from the paper.)



Last updated: 1/13/2005