Contrastive Learning for Multi-Object Tracking with Transformers

About

The DEtection TRansformer (DETR) opened new possibilities for object detection by modeling it as a translation task: converting image features into object-level representations. Previous works typically add expensive modules to DETR to perform Multi-Object Tracking (MOT), resulting in more complicated architectures. We instead show how DETR can be turned into an MOT model by employing an instance-level contrastive loss, a revised sampling strategy, and a lightweight assignment method. Our training scheme learns object appearances while preserving detection capabilities and adding little overhead. It surpasses the previous state of the art by +2.6 mMOTA on the challenging BDD100K dataset and performs comparably to existing transformer-based methods on the MOT17 dataset.
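
To make the idea of an instance-level contrastive loss concrete, the sketch below shows a generic supervised InfoNCE-style loss over object embeddings in plain Python. It is an illustration only, not the authors' formulation: the function names, the temperature value, and the assumption that each embedding carries a track identity are all hypothetical. Embeddings that share an identity are pulled together, and all other embeddings are pushed apart.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def instance_contrastive_loss(embeddings, instance_ids, temperature=0.1):
    """Generic supervised InfoNCE-style loss (illustrative assumption).

    embeddings   -- list of equal-length float lists, one per detected object
    instance_ids -- identity label for each embedding (same id = same object)
    temperature  -- softmax temperature; 0.1 is an arbitrary example value
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, num_anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and instance_ids[j] == instance_ids[i]]
        if not positives:
            continue  # an anchor without a positive contributes nothing
        # Cosine similarity of the anchor to every other embedding.
        logits = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                  for j in range(n) if j != i}
        # Numerically stable log-sum-exp over all non-anchor embeddings.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Average negative log-probability assigned to the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        num_anchors += 1
    return total / num_anchors if num_anchors else 0.0


# Two objects, each seen twice with similar appearance embeddings.
example_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
matching_ids = [1, 1, 2, 2]     # labels agree with appearance
mismatched_ids = [1, 2, 1, 2]   # labels contradict appearance
loss_matching = instance_contrastive_loss(example_embeddings, matching_ids)
loss_mismatched = instance_contrastive_loss(example_embeddings, mismatched_ids)
```

In this toy example the loss is lower when the identity labels agree with the appearance embeddings than when they contradict them, which is the signal that lets a model learn to distinguish object appearances.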

Pierre-François De Plaen, Nicola Marinello, Marc Proesmans, Tinne Tuytelaars, Luc Van Gool • 2023

Related benchmarks

Task                      Dataset         Metric     Result  Rank
Multiple Object Tracking  MOT17 (test)    MOTA       73.7    921
Multi-Object Tracking     BDD100K (val)   mIDF1      52.9    70
Multi-Object Tracking     BDD100K (test)  Mean IDF1  56.5    36