Self-supervised Multi-view Person Association and Its Applications


Reliable markerless motion tracking of multiple people participating in a complex group activity from multiple moving cameras is challenging due to frequent occlusions, strong viewpoint and appearance variations, and asynchronous video streams. To solve this problem, reliable association of the same person across distant viewpoints and temporal instances is essential. We present a simple but powerful self-supervised framework to adapt a generic person appearance descriptor to unlabeled videos by exploiting motion tracking, mutual exclusion constraints, and multi-view geometry. The adapted discriminative descriptor enables a tracking-by-clustering formulation. We validate the effectiveness of our descriptor learning on WILDTRACK and three new complex social scenes captured by multiple cameras with up to 60 people "in the wild". We report significant improvement in association accuracy (up to 18%) and 5 to 10 times more stable and coherent 3D human skeleton tracking over the baseline. Using the reconstructed 3D skeletons, we edit the input videos into a multi-angle video in which a specified person is shown from the best visible front-facing camera. Our algorithm detects inter-human occlusion to determine the camera-switching moment while still maintaining the flow of the action.
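The camera-switching idea from the abstract can be illustrated with a minimal sketch. Assumptions labeled here: the per-camera `visibility` scores, the `switch_threshold`, and the keep-current-camera hysteresis are all illustrative stand-ins; the paper derives visibility from reconstructed 3D skeletons and detected inter-human occlusion, which this toy rule does not model.

```python
def select_cameras(visibility, switch_threshold=0.5):
    """Greedy camera selection with hysteresis (illustrative sketch).

    visibility[t][c]: hypothetical fraction of the subject visible in
    camera c at time t. Keep the current camera until the subject
    becomes occluded there (score drops below switch_threshold), then
    cut to the camera with the best visibility. Staying on one camera
    until a cut is forced helps maintain the flow of the action.
    """
    current = max(range(len(visibility[0])), key=lambda c: visibility[0][c])
    schedule = [current]
    for scores in visibility[1:]:
        if scores[current] < switch_threshold:
            current = max(range(len(scores)), key=lambda c: scores[c])
        schedule.append(current)
    return schedule


# Two cameras; the subject becomes occluded in camera 0 at t = 2.
visibility = [[0.9, 0.4], [0.8, 0.5], [0.2, 0.9], [0.3, 0.95]]
print(select_cameras(visibility))  # [0, 0, 1, 1]
```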


"Automatic Adaptation of Person Association for Multiview Tracking in Group Activities"
Minh Vo, Ersin Yumer, Kalyan Sunkavalli, Sunil Hadap, Yaser Sheikh, and Srinivasa Narasimhan
[PDF] Dataset: [Tagging] [Chasing]


A discriminative and automatically domain-adaptive person appearance descriptor enables the use of clustering for multiview people tracking. This is achieved by combining motion tracking, mutual exclusion constraints, and multiview geometry in a multitask learning framework that automatically adapts a generic person appearance descriptor to the target domain videos.
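The self-supervised adaptation described above can be sketched with a toy triplet-mining loop. Assumptions labeled here: the `tracklets` dictionary, the `triplet_loss` margin, and the mining heuristic (positives from the same motion tracklet, negatives from co-occurring tracklets via mutual exclusion) are a simplified illustration, not the paper's exact multitask formulation, and multi-view geometric constraints are omitted.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge triplet loss on appearance embeddings (illustrative)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)


def mine_triplets(tracklets):
    """Mine training triplets without any identity labels.

    tracklets: {track_id: [embedding per frame]} from a motion tracker.
    Positives: two detections on the same motion tracklet (same person).
    Negatives: detections from other, co-occurring tracklets, justified
    by mutual exclusion (one person cannot occupy two tracks at once).
    """
    ids = list(tracklets)
    triplets = []
    for tid in ids:
        embs = tracklets[tid]
        if len(embs) < 2:
            continue
        anchor, positive = embs[0], embs[-1]
        for other in ids:
            if other != tid:
                triplets.append((anchor, positive, tracklets[other][0]))
    return triplets


# Toy example: two tracklets with noisy 8-D appearance embeddings.
rng = np.random.default_rng(0)
base_a, base_b = rng.normal(size=8), rng.normal(size=8)
tracklets = {
    0: [base_a + 0.01 * rng.normal(size=8) for _ in range(3)],
    1: [base_b + 0.01 * rng.normal(size=8) for _ in range(3)],
}
losses = [triplet_loss(a, p, n) for a, p, n in mine_triplets(tracklets)]
print(f"mined {len(losses)} triplets, mean loss {np.mean(losses):.3f}")
```

In the full framework such a loss would be backpropagated through the descriptor network so the generic embedding becomes discriminative on the target videos.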

Full results


Human association results on Chasing and Tagging


3D Tracking on WILDTRACK


Semantic-aware director cut



This research is supported by NSF grant CNS-1446601, ONR grant N00014-14-1-0595, the Heinz Endowments "Plattform Pittsburgh", a Qualcomm Innovation Ph.D. fellowship, and an Adobe Unrestricted Research Gift.