Building computer vision systems that understand 3D shape is important for applications including autonomous vehicles, graphics, and VR/AR. If we assume 3D shape supervision, we can now build systems that do a reasonable job of predicting 3D shapes from images. However, 3D supervision is difficult to obtain at scale, so we should aim to develop methods that can make 3D predictions given only 2D supervision. In this talk I will briefly review our Mesh R-CNN system for making 3D predictions given 3D supervision. We will then discuss differentiable rendering as a powerful tool that removes the need for 3D supervision, and I will describe the modular and efficient mesh and point cloud renderers provided by our PyTorch3D library. Finally, we will discuss two applications of our renderers, single-image shape prediction and single-image view synthesis, both trained using two-view 2D supervision.
Justin Johnson is an Assistant Professor of Computer Science and Engineering at the University of Michigan, Ann Arbor and a Visiting Scientist at Facebook AI Research. He completed his PhD at Stanford University, advised by Fei-Fei Li. His research interests lie primarily in computer vision and include visual reasoning, vision and language, 3D perception, and differentiable rendering.
Sponsored in part by Facebook Reality Labs Pittsburgh
Zoom Participation. See announcement.