We describe a
tracker that can track moving people in long sequences without manual
initialization. Moving people are modeled with the assumption that,
while configuration can vary quite substantially from frame to frame,
appearance does not. This leads to an algorithm that first builds a
model of the appearance of each individual's body by clustering
candidate body segments, and then uses this model to find all
individuals in each frame. Unusually, the tracker does not rely on a
model of human dynamics to identify possible instances of people; such
models are unreliable, because human motion is fast and large
accelerations are common. We show our tracking algorithm can be
interpreted as a loopy inference procedure on an underlying Bayes net.
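As a rough illustration of this bottom-up, two-stage idea, the sketch below clusters candidate body segments by appearance to build a per-person model and then matches each frame's segments against those models. The feature choice (mean colour of a candidate segment), the plain k-means step, and all function names are assumptions made for illustration only; they are not the pipeline from the papers below.

```python
import numpy as np

def kmeans(features, k, iters=50, seed=0):
    """Plain k-means over per-segment appearance features (one row per candidate)."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each candidate segment to its nearest appearance center
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return centers, labels

def build_appearance_models(candidate_segments, n_people):
    """Stage 1: cluster candidate body segments pooled over all frames.

    candidate_segments: list of (frame_index, feature_vector) pairs, where the
    feature vector might be e.g. the mean colour of a rectangle proposed as a
    torso or limb. Clustering by appearance yields one model per person.
    """
    feats = np.stack([feat for _, feat in candidate_segments])
    centers, _ = kmeans(feats, n_people)
    return centers  # one appearance prototype per person

def find_people_in_frame(frame_segments, appearance_models, max_dist=30.0):
    """Stage 2: assign a single frame's candidate segments to the learned models.

    No dynamic model is used: each frame is searched with appearance alone, so
    a lost track can be recovered after occlusion or re-entry into the view.
    """
    detections = {}
    for seg_id, feat in frame_segments:
        dists = np.linalg.norm(appearance_models - feat, axis=1)
        person = int(dists.argmin())
        if dists[person] < max_dist:  # ignore segments unlike any learned model
            detections.setdefault(person, []).append(seg_id)
    return detections
```

The greedy nearest-appearance assignment above is a simplification; in the papers the model-building and per-frame search are instead read as loopy inference on an underlying Bayes net.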
Experiments on video of real scenes demonstrate that this tracker can
(a) count distinct individuals; (b) identify and track them; (c)
recover when it loses track, for example when individuals are occluded
or briefly leave the view; (d) identify the configuration of the
body largely correctly; and (e) do all of this without depending on
particular models of human motion.

Ramanan, D. and Forsyth, D. A. "Finding and Tracking People From the Bottom Up." Computer Vision and Pattern Recognition (CVPR), Madison, WI, June 2003. [pdf]

Ramanan, D., Forsyth, D. A., and Zisserman, A. "Tracking People by Learning Their Appearance." IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), accepted for publication. [pdf]
A tarfile of test frames from the weave sequence is here.
The following movies are DIVX encoded.
Walk | Jumping Jacks | Weave Run