Estimating the 6D pose, i.e., 3D rotation and 3D translation, of objects relative to the camera from a single input image has attracted great interest in the computer vision community. Recent works typically address this task by training a deep network to predict the 6D pose given an image as input. While effective on standard benchmarks, these methods still struggle under challenging yet realistic conditions. In particular, these include high levels of occlusion, large depth variations across the images, the lack of annotated training data, and generalization to previously unseen objects. In this talk, I will present some of our recent work to tackle these challenges for both RGB- and point-cloud-based 6D pose estimation.
Mathieu Salzmann is a Senior Researcher at EPFL-CVLab, and, since May 2020, an Artificial Intelligence Engineer at ClearSpace (50%). Previously, he was a Senior Researcher and Research Leader in NICTA’s computer vision research group. Prior to this, from Sept 2010 to Jan 2012, he was a Research Assistant Professor at TTI-Chicago, and, from Feb 2009 to Aug 2010, a postdoctoral fellow at ICSI and EECS at UC Berkeley under the supervision of Prof. Trevor Darrell. He obtained his PhD in Jan 2009 from EPFL under the supervision of Prof. Pascal Fua. His research interests include domain adaptation in machine learning, deep network compression and architecture search, and deep learning for 3D computer vision.
The VASC Seminar is sponsored in part by Facebook Reality Labs.
Zoom Participation. See announcement.