Many traditional challenges in reconstructing 3D motion, such as matching across wide baselines and handling occlusion, diminish in significance as the number of unique viewpoints increases. However, obtaining this benefit introduces a new challenge: estimating precisely which cameras observe which points at each instant in time. We present a maximum a posteriori (MAP) estimate of the time-varying visibility of target points, which we use to reconstruct the 3D motion of an event from a large number of cameras. Our algorithm takes camera poses and image sequences as input, and outputs the time-varying set of cameras in which a target patch is visible, together with its reconstructed trajectory. We formulate visibility estimation as a MAP problem that incorporates several cues, including photometric consistency, motion consistency, and geometric consistency, in conjunction with a prior that rewards consistent visibilities in proximal cameras. An optimal estimate of visibility is obtained by finding the minimum cut of a capacitated graph over cameras. We demonstrate that our method estimates visibility with greater accuracy and improves tracking performance, producing longer trajectories, at more locations, and at higher accuracy than methods that ignore visibility or use photometric consistency alone.
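To make the minimum-cut step concrete, the following is a minimal sketch of binary visibility labeling over a small camera graph. It is not the paper's implementation: the function name `min_cut_visibility`, the per-camera costs (standing in for negative log-likelihoods of the combined cues), and the uniform pairwise penalty between neighboring cameras are all illustrative assumptions. It uses a plain Edmonds-Karp max-flow so that it has no external dependencies.

```python
from collections import deque

def min_cut_visibility(cost_visible, cost_hidden, neighbors):
    """Label each camera visible (True) or hidden (False) via an s-t minimum cut.

    cost_visible[i], cost_hidden[i]: hypothetical non-negative data costs for
    camera i under each label. neighbors: dict mapping a camera pair (i, j)
    to an assumed penalty paid when the two cameras receive different labels.
    """
    n = len(cost_visible)
    s, t = n, n + 1  # source side = "visible", sink side = "hidden"
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = cost_hidden[i]   # cut when camera i is labeled hidden
        cap[i][t] = cost_visible[i]  # cut when camera i is labeled visible
    for (i, j), w in neighbors.items():
        cap[i][j] += w               # cut when i and j disagree
        cap[j][i] += w

    def bfs_parents():
        # Breadth-first search over edges with remaining residual capacity.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs_parents()
        if parent[t] == -1:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u

    # Cameras still reachable from the source lie on the "visible" side.
    return [parent[i] != -1 for i in range(n)]

# Four cameras in a chain; camera 2 has weak evidence slightly favoring "hidden".
cost_visible = [0.1, 0.2, 1.0, 0.1]
cost_hidden = [2.0, 2.0, 0.9, 2.0]
chain = {(0, 1): 0.5, (1, 2): 0.5, (2, 3): 0.5}
labels_with_prior = min_cut_visibility(cost_visible, cost_hidden, chain)
labels_without_prior = min_cut_visibility(cost_visible, cost_hidden, {})
```

In this toy example the data costs alone would mark camera 2 as hidden, while the neighbor penalty pulls it toward the label of the adjacent cameras, which is the qualitative effect of a prior that rewards consistent visibility in proximal cameras.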
Publication
Hanbyul Joo, Hyun Soo Park, and Yaser Sheikh. MAP Visibility Estimation for Large-Scale Dynamic 3D Reconstruction. In CVPR, 2014. (Oral Presentation)
[Paper, BibTeX, Slides (pdf), Poster (pdf), CVPR Talk]
Dataset
All input data and reconstruction results are available on this dataset page.
This material is based upon work supported by the National Science Foundation under Grants No. 1353120 and 1029679. Hanbyul Joo was supported, in part, by the Samsung Scholarship.