Data-Driven 3D Primitives for Single Image Understanding




What primitives should we use to infer the rich 3D world behind an image? We argue that these primitives should be both visually discriminative and geometrically informative, and we present a technique for discovering such primitives. We demonstrate the utility of our primitives by using them to infer 3D surface normals from a single image. Our technique substantially outperforms the state of the art and shows improved cross-dataset performance.


David F. Fouhey, Abhinav Gupta, Martial Hebert.
Data-Driven 3D Primitives for Single Image Understanding.
In Proc. International Conference on Computer Vision. 2013.

Extended Results

We are providing a number of documents as supplemental material:


We provide two versions of the code. One is a black-box version of the system that can be easily plugged into other scene understanding tasks. The other is the version used internally, which includes the training code.
We also have precomputed results for many indoor scene understanding datasets. Please contact David Fouhey for these.
  1. Prediction Only (Now Available)
    [Code (2.9MB .zip), version 1.02, updated 8/14/2014]
    [Data (926MB .tar.gz)]
    [Model for standard train/test (1.5GB .zip)]
    This is a streamlined version of the prediction code, together with a model pre-trained on the NYU v2 dataset. Be careful which model you download: one is trained on the standard train/test split, and the other is not.

    The system's output can also be used as a feature in other vision tasks.
  2. Training Code (New!)


