Vision and Autonomous Systems Center Seminar

  • Remote Access Enabled - Zoom
  • Virtual Presentation
  • Assistant Professor, Department of Electrical Engineering and Computer Science
  • University of California, Berkeley
  • and Research Scientist, Google Research

Perceiving 3D Human-Object Spatial Arrangements from a Single Image In-the-wild

We live in a 3D world that is dynamic—it is full of life, with inhabitants like people and animals who interact with their environment by moving their bodies. Capturing this complex world in 3D from images has huge potential for many applications, such as compelling mixed reality experiences that can interact with people and objects in images, novel content creation tools for artists, robots that can learn to act by visually observing people, and other applications in biometrics, animal behavior science, and more. While there has been rapid progress on the problem of perceiving 3D humans from images and videos, much of the work focuses on 3D human perception alone, independent of the environment, including other objects that humans may interact with. This is particularly true for images captured in uncontrolled, "in-the-wild" settings, like those in the COCO dataset, as no ground truth 3D labels are available for humans and objects. In this talk I will discuss our recent project that recovers 3D meshes of humans and objects from a single image captured in such an uncontrolled environment. Specifically, we recover the spatial arrangements and 3D shapes of humans and objects in a globally consistent 3D scene. I will discuss challenges due to the scale ambiguities of objects and present our approach, whose key insight is that considering humans and objects jointly gives rise to "3D common sense" constraints that can be used to resolve ambiguity.

I would like to take this opportunity to discuss this project in depth; it is a collaboration with Jason Y. Zhang and Deva Ramanan at CMU, along with colleagues at FAIR. Questions and discussion on future directions are welcome!

Angjoo Kanazawa is an Assistant Professor in the Department of Electrical Engineering and Computer Science at the University of California, Berkeley. She is also a research scientist at Google Research, collaborating with Noah Snavely. Previously, she was a BAIR postdoc at UC Berkeley advised by Jitendra Malik, Alexei A. Efros, and Trevor Darrell. She completed her PhD in CS at the University of Maryland, College Park with her advisor David Jacobs. Prior to UMD, she obtained her BA in Mathematics and Computer Science. She has also spent time at the Max Planck Institute for Intelligent Systems with Michael Black. Her research is at the intersection of computer vision, graphics, and machine learning, focusing on 3D reconstruction of deformable objects such as humans and animals from everyday photographs and video.

Faculty Host: Fernando De La Torre Frade

The VACS Seminar is sponsored in part by Facebook Reality Labs Pittsburgh

Zoom Participation. See announcement.