Best Paper Award at IVS 2021


Single View Reconstruction of Repetitious Activity Using Longitudinal Self-Supervision

Code and dataset have been released


Reconstructing 4D vehicular activity (3D space and time) from cameras is useful for autonomous vehicles, commuters and local authorities to plan for smarter and safer cities. Traffic is inherently repetitious over long periods, yet current deep learning-based 3D reconstruction methods have not considered such repetitions and have difficulty generalizing to new intersection-installed cameras.

We present a novel approach exploiting longitudinal (long-term) repetitious motion as self-supervision to reconstruct 3D vehicular activity from a video captured by a single fixed camera. Starting from off-the-shelf 2D keypoint detections, our algorithm optimizes 3D vehicle shapes and poses, and then clusters their trajectories in 3D space. The 2D keypoints and trajectory clusters accumulated over the long term are later used to improve the 2D and 3D keypoints via self-supervision without any human annotation. Our method improves reconstruction accuracy over the state of the art on scenes with a significant visual difference from the keypoint detector's training data, and has many applications including velocity estimation, anomaly detection and vehicle counting. We demonstrate results on traffic videos captured at multiple city intersections, recorded with our iPhones and drawn from YouTube and other public datasets.

Video Presentation


Our method takes off-the-shelf 2D keypoint detections as input, first reconstructs 3D objects in each frame, and accumulates them over time. Then, for 2D self-supervision, good keypoints from the initial detections are selected as "2D experts" to refine bad 2D keypoints. For 3D, the accumulated 3D trajectories are clustered and the mean trajectories are used as "3D experts" to refine 3D poses. The resulting 4D reconstruction can then be applied to traffic analysis tasks such as velocity estimation.
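The "3D expert" step above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the released Traffic4D code: the page does not name the clustering algorithm, so plain k-means with a farthest-point initialization is swapped in here, and the function names `resample` and `cluster_trajectories` are hypothetical. Each accumulated trajectory is assumed to be a (T, 3) array of 3D positions.

```python
import numpy as np


def resample(traj, n=16):
    """Resample a (T, 3) trajectory to n points evenly spaced by arc length."""
    traj = np.asarray(traj, dtype=float)
    seg = np.linalg.norm(np.diff(traj, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])  # cumulative arc length
    targets = np.linspace(0.0, s[-1], n)
    cols = [np.interp(targets, s, traj[:, d]) for d in range(traj.shape[1])]
    return np.stack(cols, axis=1)


def cluster_trajectories(trajs, k, n=16, iters=20):
    """Group trajectories with k-means (an assumption, see lead-in).

    Returns per-trajectory labels and the mean trajectory of each
    cluster, shaped (k, n, 3); the means play the role of "3D experts".
    """
    X = np.stack([resample(t, n).ravel() for t in trajs])

    # Farthest-point initialization: deterministic and well spread.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)

    for _ in range(iters):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                centers[j] = X[labels == j].mean(axis=0)

    # Final assignment against the updated centers.
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    return labels, centers.reshape(k, n, -1)
```

Resampling by arc length puts trajectories of different durations on a common footing before averaging, so that a slow and a fast vehicle following the same lane contribute to the same mean path.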



We demonstrate our results on traffic videos captured at multiple city intersections from various sources.


We show cameras in the Traffic4D dataset collected by us (off-white) and from the AI City Challenge (gold). The [Code] and [Dataset] can be downloaded from the links above.

3D Expert Visualization

Visualizing 3D experts with map overlay.

Sequence Traffic4D-001

Multiple Intersections


Comparison between initial reconstruction and longitudinal self-supervision.

Sequence Traffic4D-005

Sequence Traffic4D-007


We demonstrate applications to traffic tasks such as velocity estimation, vehicle counting and anomaly analysis.
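To make the first two tasks concrete, here is a minimal sketch of how speed and counts could be read off a reconstructed trajectory. It assumes the 3D positions are in meters and sampled at a known frame rate, neither of which is stated on this page; the helper names `speeds_mps` and `count_per_cluster` are hypothetical and not part of the released code.

```python
from collections import Counter

import numpy as np


def speeds_mps(positions_m, fps):
    """Per-step speed in m/s from a (T, 3) array of positions in meters.

    Consecutive rows are assumed to be one video frame apart.
    """
    positions_m = np.asarray(positions_m, dtype=float)
    step = np.linalg.norm(np.diff(positions_m, axis=0), axis=1)  # meters per frame
    return step * fps


def count_per_cluster(labels):
    """Count vehicles per trajectory cluster (e.g. per lane or turn direction)."""
    return dict(Counter(int(label) for label in labels))


# A vehicle moving 0.5 m per frame along x, filmed at 30 frames per second.
track = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.0, 0.0]]
mean_speed = speeds_mps(track, fps=30).mean()  # 15 m/s, i.e. 54 km/h
counts = count_per_cluster([0, 1, 0, 0])       # three vehicles in cluster 0, one in cluster 1
```

Finite differences amplify frame-to-frame jitter, so in practice one would likely smooth the trajectory or average the speed over a window before reporting it.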

Velocity Estimation and Vehicle Counting

Sequence Traffic4D-001

Sequence AI City S02_007

Anomaly Detection

Sequence Traffic4D-009

More Details

For an in-depth description of Traffic4D, please refer to our paper.

"Traffic4D: Single View Reconstruction of Repetitious Activity Using Longitudinal Self-Supervision" (Best Paper)

Fangyu Li, N. Dinesh Reddy, Xudong Chen and Srinivasa G. Narasimhan
Proceedings of IEEE Intelligent Vehicles Symposium (IV '21)
[PDF] [Poster] [Slides] [Representative Image] [Code] [Dataset] [Bibtex]

Link to paper