
Tracking Objects as Points

About

Tracking has traditionally been the art of following interest points through space and time. This changed with the rise of powerful deep networks. Nowadays, tracking is dominated by pipelines that perform object detection followed by temporal association, also known as tracking-by-detection. In this paper, we present a simultaneous detection and tracking algorithm that is simpler, faster, and more accurate than the state of the art. Our tracker, CenterTrack, applies a detection model to a pair of images and detections from the prior frame. Given this minimal input, CenterTrack localizes objects and predicts their associations with the previous frame. That's it. CenterTrack is simple, online (no peeking into the future), and real-time. It achieves 67.3% MOTA on the MOT17 challenge at 22 FPS and 89.4% MOTA on the KITTI tracking benchmark at 15 FPS, setting a new state of the art on both datasets. CenterTrack is easily extended to monocular 3D tracking by regressing additional 3D attributes. Using monocular video input, it achieves 28.3% AMOTA@0.2 on the newly released nuScenes 3D tracking benchmark, substantially outperforming the monocular baseline on this benchmark while running at 28 FPS.
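The abstract says CenterTrack localizes objects and predicts their associations with the previous frame, but it does not spell out the matching rule. The following is a minimal sketch of one plausible way to do it, assuming each detection carries a predicted 2D displacement back to its position in the prior frame and that matching is greedy by nearest center. The function name `greedy_associate`, the dictionary fields (`center`, `offset`, `score`, `id`), and the `max_dist` threshold are all hypothetical and are not taken from the authors' code.

```python
import math


def greedy_associate(detections, prev_tracks, max_dist=50.0):
    """Greedily match current detections to previous-frame tracks.

    detections:  list of dicts with "center" (x, y), "offset" (dx, dy), "score".
                 "offset" is an assumed predicted displacement that points from
                 the current center to the object's center in the prior frame.
    prev_tracks: list of dicts with "id" and "center" (x, y).
    Returns a list of (center, track_id) pairs, ordered by descending score.
    """
    # New identities start after the largest id seen in the prior frame.
    next_id = max((t["id"] for t in prev_tracks), default=-1) + 1
    used = set()
    results = []

    # Handle the most confident detections first.
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        # Project the detection back to its estimated prior-frame position.
        px = det["center"][0] + det["offset"][0]
        py = det["center"][1] + det["offset"][1]

        # Find the closest unmatched prior track within max_dist.
        best, best_dist = None, max_dist
        for track in prev_tracks:
            if track["id"] in used:
                continue
            dist = math.hypot(px - track["center"][0], py - track["center"][1])
            if dist < best_dist:
                best, best_dist = track, dist

        if best is None:
            # No prior track is close enough: start a new identity.
            track_id = next_id
            next_id += 1
        else:
            track_id = best["id"]
            used.add(track_id)
        results.append((det["center"], track_id))
    return results


# Toy example with made-up coordinates (illustrative values only).
prev_tracks = [
    {"id": 0, "center": (100.0, 100.0)},
    {"id": 1, "center": (300.0, 120.0)},
]
detections = [
    {"center": (110.0, 102.0), "offset": (-9.0, -2.0), "score": 0.9},
    {"center": (305.0, 118.0), "offset": (-4.0, 1.0), "score": 0.8},
    {"center": (500.0, 400.0), "offset": (0.0, 0.0), "score": 0.7},
]
matches = greedy_associate(detections, prev_tracks)
```

In this toy example the first two detections inherit identities 0 and 1, and the third, which is far from every prior track, receives a new identity. Because the matching only looks at the previous frame, the sketch stays online in the sense the abstract describes: no future frames are needed.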

Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multiple Object Tracking | MOT17 (test) | MOTA 67.8 | 921 |
| 3D Object Detection | nuScenes (test) | -- | 829 |
| Multi-Object Tracking | DanceTrack (test) | HOTA 0.481 | 355 |
| Multi-Object Tracking | MOT16 (test) | MOTA 69.6 | 228 |
| Multi-Object Tracking | SportsMOT (test) | HOTA 62.7 | 199 |
| 3D Multi-Object Tracking | nuScenes (test) | ID Switches 3.81e+3 | 130 |
| 3D Multi-Object Tracking | nuScenes (val) | AMOTA 6.8 | 115 |
| 2D Multi-Object Tracking | KITTI car (test) | MOTA 88.83 | 65 |
| Multi-Object Tracking | KITTI Tracking (test) | MOTA 89.44 | 56 |
| Multi-Object Tracking | MOT17 | MOTA 67.8 | 55 |
Showing 10 of 37 rows
