Relevant training data. While large, manually collected datasets exist, the variations they capture in appearance, shape and pose are often uncontrolled, which limits overall performance. To overcome this limitation, we propose a technique for extending an existing training set that allows explicit control over pose and shape variations. For this, we build on recent advances in computer graphics to generate samples with realistic appearance and background while modifying body shape and pose. We validate the effectiveness of our method on the task of articulated people detection and demonstrate significant improvements over using real training data alone.
Expressive models. Despite the high variability of body articulations, human motions and activities often constrain the positions of multiple body parts simultaneously. We propose a model that incorporates higher-order part dependencies while remaining efficient. We achieve this by defining a conditional model in which all body parts are connected a priori, but which becomes a tractable tree-structured pictorial structures model once the image observations are available. Combining the proposed model with strong appearance representations outperforms other pose estimation methods on standard benchmarks.
Novel benchmark. We introduce a novel benchmark that makes a significant advance in the diversity and difficulty of human pose estimation and activity recognition challenges. Our comprehensive dataset was collected using an established taxonomy of over 800 human activities. The collected images cover a wider variety of human activities than previous datasets and capture people in a wider range of environments, with a high diversity of poses, appearances and viewpoints. We perform a detailed analysis of the leading human pose estimation and activity recognition approaches, gaining insights into the successes and failures of these methods.
Leonid Pishulin received his M.Sc. in Computer Science from RWTH Aachen University, Germany, in 2010, and joined the group of Bernt Schiele at the Max Planck Institute for Informatics as a PhD student. His work focuses mainly on human pose estimation and articulated people detection; more recently he has been working on human activity recognition and 3D human shape modeling.