We address the task of inferring the future actions of people from noisy visual input. We denote this task activity forecasting. To achieve accurate activity forecasting, our approach models the effect of the physical environment on the choice of human actions. This is accomplished by the use of state-of-the-art semantic scene understanding combined with ideas from optimal control theory. Our unified model also integrates several other key elements of activity analysis, namely, destination forecasting, sequence smoothing and transfer learning. As proof-of-concept, we focus on the domain of trajectory-based activity analysis from visual input. Experimental results demonstrate that our model accurately predicts distributions over future actions of individuals. We show how the same techniques can improve the results of tracking algorithms by leveraging information about likely goals and trajectories.
Kris M. Kitani, Brian Ziebart, James D. Bagnell and Martial Hebert.
European Conference on Computer Vision (ECCV 2012), October 2012.
Best Paper Award Honorable Mention ECCV 2012