Two-Granularity Tracking: Mediating Trajectory and Detection Graphs for Tracking under Occlusions

Katerina Fragkiadaki ¹ Weiyu Zhang ¹ Geng Zhang ² Jianbo Shi ¹

¹CIS, UPenn ² Xi'an University

Abstract

We want to segment and track objects that occlude each other in crowded scenes. We propose a tracking framework that mediates grouping cues from two levels of tracking granularity: coarse-grain detection tracklets and fine-grain point trajectories. Each granularity proposes its own grouping cues: trajectories with similar long-term motion and disparity attract each other, while detections overlapping in time repel each other. Tracking is formulated as selection-clustering in the joint detection and trajectory space. Affinities of trajectories and detections will be contradictory in cases of false alarm detections or accidental motion similarity of trajectories. We resolve such contradictions in a steering-clustering framework where confident detections change trajectory affinities by inducing repulsions between trajectories claimed by mutually repulsive detection tracklets. Two-granularity tracking offers a unified representation for object segmentation and tracking that is independent of which objects are tracked, how occluded they are, whether the input is monocular or binocular, and whether the camera is moving.

Steered co-clustering of detection tracklets and trajectories

Detection tracklets and point trajectories: complementary for tracking/segmentation.

Detections capture objects when they are mostly visible. They may be sparse in time, may miss partially occluded or deformed objects, and may contain false positives.
Point trajectories are dense in space and time. Their affinities integrate long-range motion and 3D disparity information, which is useful for segmentation. However, affinities may leak across similarly moving objects.

Two-Granularity joint graph
We formulate object tracking as co-clustering in the joint space of detection tracklets and point trajectories. We consider a joint detection and trajectory graph and establish:

Affinities between detection tracklets according to appearance similarity and motion smoothness.
Repulsions between detection tracklets overlapping in time.
Affinities between trajectories according to long-term motion/disparity similarity during their time overlap.
Detection to trajectory associations according to detection and trajectory long-term overlap.
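
The three time-dependent ingredients listed above (tracklet repulsions, trajectory affinities, and detection-to-trajectory associations) can be illustrated with a minimal Python sketch. This is not the released code: the function names, the Gaussian kernel, the `sigma` value, the use of the largest velocity difference, and the box-containment test are all assumptions chosen for illustration, and disparity is left out for brevity.

```python
import math

def time_overlap(a, b):
    """Number of frames shared by two inclusive (start, end) frame intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def tracklet_repulsion(spans):
    """Binary repulsion matrix: 1.0 when two distinct tracklets coexist in time."""
    n = len(spans)
    return [[1.0 if i != j and time_overlap(spans[i], spans[j]) > 0 else 0.0
             for j in range(n)] for i in range(n)]

def trajectory_affinity(p, q, sigma=2.0):
    """Gaussian affinity from the largest velocity difference over common frames.

    p and q map frame index -> (x, y). The kernel and sigma are assumptions.
    """
    common = sorted(set(p) & set(q))
    steps = [(s, t) for s, t in zip(common, common[1:]) if t == s + 1]
    if not steps:
        return 0.0  # no shared consecutive frames, so no motion evidence
    worst = 0.0
    for s, t in steps:
        vp = (p[t][0] - p[s][0], p[t][1] - p[s][1])
        vq = (q[t][0] - q[s][0], q[t][1] - q[s][1])
        worst = max(worst, math.hypot(vp[0] - vq[0], vp[1] - vq[1]))
    return math.exp(-(worst ** 2) / (2 * sigma ** 2))

def association(traj, boxes):
    """Fraction of the trajectory's frames whose point lies inside the tracklet box.

    boxes maps frame index -> (x0, y0, x1, y1).
    """
    if not traj:
        return 0.0
    inside = sum(
        1 for f in set(traj) & set(boxes)
        if boxes[f][0] <= traj[f][0] <= boxes[f][2]
        and boxes[f][1] <= traj[f][1] <= boxes[f][3]
    )
    return inside / len(traj)
```

Two trajectories that translate identically receive affinity 1.0 regardless of how far apart they are, which is exactly the "accidental motion similarity" case that the tracklet repulsions are meant to counteract.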

The resulting joint graph suffers from two problems that confuse the co-clustering: 1) false alarm detection tracklets that erroneously claim trajectories, and 2) contradictions between trajectory affinities and detection tracklet repulsions in cases of accidental motion similarity.

Steering Cut
We iteratively sample detection tracklets according to confidence. We steer trajectory affinities and associations to comply with the repulsions of the selected detection tracklets.

Clustering in the steered graph provides the space-time object clusters.
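
The steering idea can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the method as implemented in the paper: `sample_tracklets`, `steer`, and `connected_components` are hypothetical names, cutting an affinity to zero stands in for inducing a repulsion, and thresholded connected components stand in for the actual clustering step.

```python
import random

def sample_tracklets(confidence, k, rng):
    """Draw up to k distinct tracklet indices, weighted by (positive) confidence."""
    pool = list(range(len(confidence)))
    chosen = []
    for _ in range(min(k, len(pool))):
        weights = [confidence[i] for i in pool]
        pick = rng.choices(pool, weights=weights, k=1)[0]
        chosen.append(pick)
        pool.remove(pick)
    return chosen

def steer(affinity, claims, repulsion, selected):
    """Copy the trajectory affinities and cut links that contradict repulsions.

    claims[d] is the set of trajectory indices associated with tracklet d.
    A link i-j is cut when i and j are claimed by two different selected
    tracklets that repel each other (assumption: cut means affinity 0.0).
    """
    steered = [row[:] for row in affinity]
    for a in selected:
        for b in selected:
            if a < b and repulsion[a][b] > 0:
                for i in claims[a] - claims[b]:
                    for j in claims[b] - claims[a]:
                        steered[i][j] = steered[j][i] = 0.0
    return steered

def connected_components(affinity, threshold=0.5):
    """Stand-in clustering: label nodes linked by affinities above threshold."""
    n = len(affinity)
    labels = [-1] * n
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and affinity[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Four trajectories: 0,1 belong to one pedestrian and 2,3 to another,
# but 1 and 2 move alike by accident, so their affinity leaks.
affinity = [
    [0.0, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.9, 0.0],
    [0.0, 0.9, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.0],
]
claims = [{0, 1}, {2, 3}]             # trajectories claimed by each tracklet
repulsion = [[0.0, 1.0], [1.0, 0.0]]  # the two tracklets overlap in time
steered = steer(affinity, claims, repulsion, selected=[0, 1])
```

In this toy example the unsteered graph forms a single cluster because of the leaking 1-2 link, while the steered graph separates into the two groups claimed by the two repulsive tracklets.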

Results - Code

The latest version of the source code can be downloaded here. Please report comments/bugs to katef at seas.upenn.edu.

Dataset

The UrbanStreet dataset used in the paper can be downloaded here [188M]. It contains 18 stereo sequences of pedestrians captured by a stereo rig mounted on a car driving through the streets of Philadelphia during rush hour. The image resolution is 516x1024. Ground truth is provided in the form of pedestrian segmentation masks for the left view. All pedestrians taller than 100 pixels are labelled every 4 frames (0.6 seconds) in each video sequence. The video below shows ground-truth label samples.

Paper

Two-Granularity Tracking: Mediating Trajectory and Detection Graphs for Tracking under Occlusions. Katerina Fragkiadaki, Weiyu Zhang, Geng Zhang, and Jianbo Shi. In ECCV 2012. paper | poster

Last update: Dec, 2012.