State-of-the-art approaches for articulated human pose estimation are rooted in parts-based graphical models. These models are often restricted to tree-structured representations and simple parametric potentials in order to enable tractable inference. However, these simple dependencies fail to capture all the interactions between body parts. While models with more complex interactions can be defined, learning their parameters remains challenging when inference is intractable or approximate. In this paper, instead of performing inference on a learned graphical model, we build upon the inference machine framework and present a method for articulated human pose estimation. Our approach incorporates rich spatial interactions among multiple parts and information across parts of different scales. Additionally, the modular framework of our approach enables both ease of implementation, without specialized optimization solvers, and efficient inference. We analyze our approach on two challenging datasets with large pose variation and outperform the state of the art on these benchmarks.
Varun Ramakrishna is a PhD student in the Robotics Institute, advised by Prof. Yaser Sheikh and Prof. Takeo Kanade. His research interests include structured prediction problems in computer vision with a focus on understanding human posture and motion from monocular images and image sequences. Varun was previously a master's student in the ECE department at CMU and earned his undergraduate degree from IIT Madras.
Host: Kris Kitani