

Blocks World Revisited

"the perception of solid objects is a process which can be based on the properties of three-dimensional transformations and the laws of nature"

-Larry Roberts (1965)


Idea


Since most current scene understanding approaches operate either on the 2D image or on a surface-based representation, they do not allow reasoning about the physical constraints within the 3D scene. Inspired by the "Blocks World" work of the 1960s, we present a qualitative physical representation of an outdoor scene where objects have volume and mass, and relationships describe 3D structure and mechanical configurations. Our representation allows us to apply powerful global geometric constraints between 3D volumes as well as the laws of statics in a qualitative manner. We also present a novel iterative "interpretation-by-synthesis" approach where, starting from an empty ground plane, we progressively "build up" a physically plausible 3D interpretation of the image. For surface layout estimation, our method demonstrates an improvement in performance over the state-of-the-art geometric context algorithm. More importantly, our approach automatically generates 3D parse graphs which describe qualitative geometric and mechanical properties of objects and the relationships between objects within an image.
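As a rough illustration of the "build up" idea, the loop below greedily adds one candidate at a time to an initially empty scene, keeping only candidates that stay plausible given what has already been placed. It is a generic sketch, not the procedure from the paper: the names `build_up`, `score`, `is_plausible`, and `min_score` are hypothetical placeholders, and the actual block hypotheses, scoring, and constraints are described in the paper itself.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def build_up(
    candidates: Iterable[T],
    score: Callable[[List[T], T], float],
    is_plausible: Callable[[List[T], T], bool],
    min_score: float = 0.0,
) -> List[T]:
    """Greedy interpretation-by-synthesis loop (illustrative only).

    score and is_plausible are hypothetical callbacks supplied by the
    caller; they stand in for whatever image evidence and physical
    constraints a real system would use.
    """
    scene: List[T] = []            # start from an empty scene
    remaining = list(candidates)
    while remaining:
        # Keep only candidates consistent with the current scene.
        valid = [c for c in remaining if is_plausible(scene, c)]
        if not valid:
            break
        # Add the best-scoring plausible candidate, if it is good enough.
        best = max(valid, key=lambda c: score(scene, c))
        if score(scene, best) < min_score:
            break
        scene.append(best)
        remaining.remove(best)
    return scene
```

For example, with 1D intervals as toy "blocks", interval length as the score, and non-overlap as the plausibility test, the loop keeps the longest interval first and then any remaining interval that does not collide with it.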

Global Constraints


  • Static Equilibrium: Under the static world assumption, the forces and torques acting on a block should cancel out (Newton's first law).
  • Support Force Constraint: A supporting object should have enough strength to provide contact reaction forces on the supported objects.
  • Volumetric Constraints: All the objects in the world must have finite volumes and cannot inter-penetrate each other.
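The three constraints above can be made concrete with a toy 2D example. The sketch below is a simplified quantitative stand-in, whereas the paper reasons qualitatively; it is not the authors' implementation. The `Block` class, the axis-aligned box model, and the use of density as a proxy for supporting strength are all assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Axis-aligned 2D box (hypothetical toy model, y points up)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    density: float = 1.0

    @property
    def center_x(self) -> float:
        # Horizontal position of the center of mass of a uniform box.
        return 0.5 * (self.x_min + self.x_max)


def has_finite_volume(b: Block) -> bool:
    """Volumetric constraint, part 1: the block has positive extent."""
    return b.x_max > b.x_min and b.y_max > b.y_min


def interpenetrate(a: Block, b: Block) -> bool:
    """Volumetric constraint, part 2: True if interiors overlap.

    Blocks that merely touch along a face do not count as overlapping.
    """
    return (a.x_min < b.x_max and b.x_min < a.x_max
            and a.y_min < b.y_max and b.y_min < a.y_max)


def is_balanced_on(top: Block, support: Block, tol: float = 1e-9) -> bool:
    """Static equilibrium for a single support: the top block rests on
    the support and its center of mass lies over the contact region,
    so gravity and the reaction force produce no net torque."""
    touching = abs(top.y_min - support.y_max) <= tol
    contact_lo = max(top.x_min, support.x_min)
    contact_hi = min(top.x_max, support.x_max)
    return touching and contact_lo <= top.center_x <= contact_hi


def can_support(top: Block, support: Block) -> bool:
    """Support force constraint, with density as an assumed proxy for
    strength: the support must be at least as dense as its load."""
    return support.density >= top.density
```

In this model a tower sitting squarely on the ground passes all checks, while a block that overhangs the edge of its support by more than half its width fails `is_balanced_on`.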


Results

3D parse graphs automatically generated by our system for all 250 test images are available in the 3D Parse Graphs Gallery.

Downloads

Dataset

We used the Geometric Context dataset. This dataset can be downloaded from here. The ground-truth segmentations can also be downloaded from here.

Code

Download the Blocks World code. Please cite the paper if you use the code.

Citation

Abhinav Gupta, Alexei A. Efros and Martial Hebert, Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics, European Conference on Computer Vision, 2010. (PDF)


Bibtex Reference

@inproceedings{GuptaEfrosHebert_ECCV10,
   author="Abhinav Gupta and Alexei A. Efros and Martial Hebert",
   title="Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics",
   booktitle="European Conference on Computer Vision (ECCV)",
   year="2010",
}

Acknowledgements

This research was supported by NSF Grant IIS-0905402 and a Guggenheim Fellowship to Alexei Efros.

Useful Links

  • Paper (PDF)
  • Parse Graphs Gallery (New)
  • Presentation (PPT)
  • Code (New)

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder, except when identified by Creative Commons License 2.0, in which case the license applies to both the original and modified versions of the images.