Time: Sunday, September 7, 2014 – Afternoon (2 PM)
What does it mean to understand an image? The bounding-box or segment-level understanding produced by many current computer vision systems tells us little about where objects are located in 3D or how agents such as humans could interact with them. Recent work has therefore focused on obtaining a complementary, geometric understanding of the scene in terms of the 3D volumes and surfaces that compose it and their interactions. This representation enables reasoning about objects as they exist in a 3D world, rather than simply in the image plane, and has been shown to have many applications in object detection, human-centric understanding, and graphics. Additionally, recent dataset collection efforts with depth cameras have made large-scale learning of these geometric representations possible, opening up exciting avenues for research with RGB-D datasets.
The tutorial organizers will summarize the state of the art in 3D scene understanding in a half-day tutorial. Participants will learn the fundamentals of 3D scene understanding, with the aim of enabling both its application to traditional 2D image tasks and research on the topic itself.
Basic knowledge of machine learning is required; basic knowledge of perspective geometry is helpful but not required.
Many of the topics covered in this tutorial are also discussed in Representations and Techniques for 3D Object Recognition and Scene Interpretation (D. Hoiem and S. Savarese, Morgan & Claypool Publishers, 2011).