
MOTR: End-to-End Multiple-Object Tracking with Transformer

About

Temporal modeling of objects is a key challenge in multiple-object tracking (MOT). Existing methods track by associating detections through motion-based and appearance-based similarity heuristics. Because association is performed as post-processing, these methods cannot exploit temporal variations in video sequences end to end. In this paper, we propose MOTR, which extends DETR and introduces the track query to model tracked instances across the entire video. Track queries are transferred and updated frame by frame to perform iterative prediction over time. We propose tracklet-aware label assignment to train track queries and newborn object queries. We further propose a temporal aggregation network and a collective average loss to enhance temporal relation modeling. Experimental results on DanceTrack show that MOTR significantly outperforms the state-of-the-art method, ByteTrack, by 6.5% on the HOTA metric. On MOT17, MOTR outperforms our concurrent works, TrackFormer and TransTrack, on association performance. MOTR can serve as a stronger baseline for future research on temporal modeling and Transformer-based trackers. Code is available at https://github.com/megvii-research/MOTR.
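The frame-by-frame propagation of track queries described above can be illustrated with a toy sketch. This is not the MOTR implementation: `Query`, `track_video`, `toy_decode`, and the `keep_thresh` parameter are hypothetical names, the scalar `state` stands in for a learned embedding, and `toy_decode` replaces the Transformer decoder with a hand-written nearest-position rule purely so the example runs. Only the control flow is meant to mirror the idea: surviving queries carry their identity to the next frame, and fresh detect queries pick up newborn objects.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, List, Sequence, Tuple


@dataclass
class Query:
    state: float        # stand-in for a learned query embedding
    track_id: int = -1  # -1 marks a detect query with no identity yet


Decoder = Callable[[Sequence[float], List[Query]], List[Tuple[float, float]]]


def track_video(frames, decode: Decoder, num_detect_queries=2, keep_thresh=0.5):
    """Carry track queries across frames; return the track ids alive per frame."""
    next_id = count()
    track_queries: List[Query] = []
    results = []
    for frame in frames:
        # Track queries come first, followed by fresh detect queries.
        detect_queries = [Query(state=0.0) for _ in range(num_detect_queries)]
        queries = track_queries + detect_queries
        outputs = decode(frame, queries)  # one (updated_state, score) per query
        survivors = []
        for query, (state, score) in zip(queries, outputs):
            if score < keep_thresh:
                continue  # ended track or unused detect query
            # A surviving detect query becomes a newborn track with a new id.
            tid = query.track_id if query.track_id >= 0 else next(next_id)
            survivors.append(Query(state=state, track_id=tid))
        track_queries = survivors  # updated queries are passed to the next frame
        results.append([q.track_id for q in survivors])
    return results


def toy_decode(frame, queries):
    """Hypothetical decoder stand-in: objects are 1-D positions in `frame`."""
    unclaimed = list(frame)
    outputs = []
    for q in queries:
        if q.track_id >= 0:
            # A track query follows the closest position within distance 1.0.
            near = [p for p in unclaimed if abs(p - q.state) <= 1.0]
            if near:
                best = min(near, key=lambda p: abs(p - q.state))
                unclaimed.remove(best)
                outputs.append((best, 1.0))
            else:
                outputs.append((q.state, 0.0))
        elif unclaimed:
            # A detect query claims an object that no track query explains.
            outputs.append((unclaimed.pop(0), 1.0))
        else:
            outputs.append((0.0, 0.0))
    return outputs


frames = [[0.0, 10.0], [0.5, 10.5], [1.0, 20.0]]
ids_per_frame = track_video(frames, toy_decode)
```

In this toy run the object near position 0 keeps one identity across all three frames, the object near 10 disappears in the last frame, and the object at 20 is picked up by a detect query and receives a new identity. No separate association step is run after decoding; identity follows from which query produced the output.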

Fangao Zeng, Bin Dong, Yuang Zhang, Tiancai Wang, Xiangyu Zhang, Yichen Wei • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Multiple Object Tracking | MOT17 (test) | MOTA | 78.6 | 1020 |
| Multi-Object Tracking | DanceTrack (test) | HOTA | 0.542 | 471 |
| Multiple Object Tracking | MOT20 (test) | IDF1 | 57.9 | 426 |
| Multi-Object Tracking | SportsMOT (test) | HOTA | 55.8 | 253 |
| Multi-Object Tracking | MOT17 | IDF1 | 68.6 | 104 |
| Multi-Object Tracking | BDD100K (val) | mIDF1 | 54 | 70 |
| Multi-Object Tracking | MOT17 1.0 (test) | MOTA | 65.1 | 48 |
| Multi-Object Tracking | MOT16 1.0 (test) | MOTA | 65.7 | 21 |
| Multi-Object Tracking | DanceTrack 58 (test) | HOTA | 54.2 | 20 |
| Multi-Object Tracking | MOTChallenge 20 (test) | MOTA | 58.7 | 10 |
Showing 10 of 16 rows
