Vision and Autonomous Systems Seminar

  • Remote Access - Zoom
  • Virtual Presentation - ET

3D Recognition with self-supervised learning and generic architectures

Supervised learning relies on manual labeling, which scales poorly with the number of tasks and the amount of data. Manual labeling is especially cumbersome for 3D recognition tasks such as detection and segmentation, and thus most 3D datasets are surprisingly small compared to image or video datasets. 3D recognition methods are also fragmented by the type of 3D input representation (voxels, points or meshes) and typically require manual design, for example manually encoding various 3D radius values that depend on the 3D dataset. In this talk I will present our recent efforts (to appear at ICCV’21) in these two directions. I’ll first talk about DepthContrast, a simple self-supervised learning method for pre-training 3D architectures. DepthContrast works with various types of 3D data (single-view, multi-view, indoor/outdoor, voxels/points) and shows state-of-the-art performance compared to specialized self-supervised methods. Second, I’ll present 3DETR, a generic and simple framework for 3D object detection using Transformers. 3DETR is much simpler to implement than current 3D detection methods such as VoteNet, as it relies on significantly fewer handcrafted 3D operators and losses. 3DETR shows strong performance on 3D detection benchmarks.

Ishan Misra is a Research Scientist at Facebook AI Research (FAIR), where he works on Computer Vision and Machine Learning. His research interest is in reducing the need for supervision in visual learning. He finished his PhD at the Robotics Institute at Carnegie Mellon University, where he worked with Martial Hebert and Abhinav Gupta. His PhD thesis was titled “Visual Learning with Minimal Human Supervision”, for which he received the SCS Distinguished Dissertation Award (Runner Up) 2018.

Zoom Participation. See announcement.

Sponsored in part by Facebook Reality Labs Pittsburgh
