Vision and Autonomous Systems Seminar
- Gates Hillman Centers
- Traffic21 Classroom 6501
- GERARD PONS-MOLL
- Research Group Leader
- Max-Planck-Institut für Informatik
- Saarland Informatics Campus
Capturing and Learning Digital Humans
The world is shifting towards a digitization of everything -- music, books, movies and news in digital form are common in our everyday lives. Digitizing human beings would redefine the way we think and communicate (with other humans and with machines), and it is necessary for many applications; for example, to transport people into virtual and augmented reality, for entertainment and special effects in movies, and for medicine and psychology.
Currently, digital people models typically lack realism or require time-consuming manual editing of physical simulation parameters. Our hypothesis is that better and more realistic models of humans and clothing can be learned directly by capturing real people using 4D scans, images, and depth and inertial sensors. Combining statistical machine learning techniques and geometric optimization, we create realistic statistical models from the captured data. To digitize people from low-cost ubiquitous sensors (RGB cameras, depth sensors, or a small number of wearable inertial sensors), we leverage the learned statistical models, which are robust to noise and missing data.
I will give an overview of a selection of projects where the goal is to build realistic models of human pose, shape, soft-tissue and clothing. I will also present some of our recent work on 3D reconstruction of people models from monocular video, and real-time joint reconstruction of surface geometry and human body shape from depth data. I will conclude the talk by outlining the next challenges in building digital humans and perceiving them from sensory data.
Gerard Pons-Moll is the head of the research group "Real Virtual Humans" at the Max Planck Institute for Informatics (MPII) in Saarbrücken, Germany. His research lies at the intersection of computer vision, computer graphics and machine learning -- with a special focus on analyzing people in videos, and creating virtual human models by "looking" at real ones. His research has produced some of the most advanced statistical human body models (which are currently used for a number of applications in industry and research), as well as pioneering algorithms to track and reconstruct 3D people models from images, video, depth, and IMUs.
His work has been published at the major computer vision and computer graphics conferences and journals, including SIGGRAPH, SIGGRAPH Asia, CVPR, ICCV, BMVC (Best Paper Award 2013), Eurographics (Best Paper Award 2017), IJCV and TPAMI. He serves regularly as a reviewer for TPAMI, IJCV, SIGGRAPH, SIGGRAPH Asia, CVPR, ICCV, ECCV, Eurographics, SCA, ICML, NIPS and others. He co-organized workshops and tutorials on capturing and learning digital models of humans at SIGGRAPH'16, ICCV'11, ICCV'15 and ECCV'18.