
OVTrack: Open-Vocabulary Multiple Object Tracking

About

The ability to recognize, localize and track dynamic objects in a scene is fundamental to many real-world applications, such as self-driving and robotic systems. Yet, traditional multiple object tracking (MOT) benchmarks rely only on a few object categories that hardly represent the multitude of possible objects that are encountered in the real world. This leaves contemporary MOT methods limited to a small set of pre-defined object categories. In this paper, we address this limitation by tackling a novel task, open-vocabulary MOT, that aims to evaluate tracking beyond pre-defined training categories. We further develop OVTrack, an open-vocabulary tracker that is capable of tracking arbitrary object classes. Its design is based on two key ingredients: First, leveraging vision-language models for both classification and association via knowledge distillation; second, a data hallucination strategy for robust appearance feature learning from denoising diffusion probabilistic models. The result is an extremely data-efficient open-vocabulary tracker that sets a new state-of-the-art on the large-scale, large-vocabulary TAO benchmark, while being trained solely on static images. Project page: https://www.vis.xyz/pub/ovtrack/
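The two ingredients can be made concrete with a toy sketch. This is not the OVTrack implementation; it is a minimal, hypothetical illustration of the underlying idea: detections are classified in an open vocabulary by comparing their embeddings against text embeddings of arbitrary class names (as a vision-language model like CLIP would provide), and tracks are associated across frames by appearance-embedding similarity. All function names, the greedy matching scheme, and the similarity threshold are assumptions for illustration.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between each row of `a` and each row of `b`."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def classify_open_vocab(det_embeds, text_embeds, class_names):
    """Assign each detection the class whose text embedding is most similar.

    Because `class_names` can be any list of strings (embedded by a
    vision-language model), the classifier is not tied to a fixed label set.
    """
    sims = cosine_sim(det_embeds, text_embeds)
    return [class_names[i] for i in sims.argmax(axis=1)]

def associate(track_embeds, det_embeds, threshold=0.5):
    """Greedy appearance matching: each track claims its best unclaimed
    detection if the similarity clears the threshold. Returns a dict
    mapping track index -> detection index."""
    sims = cosine_sim(track_embeds, det_embeds)
    matches, used = {}, set()
    for t in range(sims.shape[0]):
        d = int(sims[t].argmax())
        if sims[t, d] >= threshold and d not in used:
            matches[t] = d
            used.add(d)
    return matches

# Usage with toy 3-D embeddings (real embeddings would be e.g. 512-D):
text = np.eye(3)                                   # "cat", "dog", "bird" text embeddings
dets = np.array([[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]])
print(classify_open_vocab(dets, text, ["cat", "dog", "bird"]))  # ['cat', 'bird']
```

A production tracker would replace the greedy loop with an optimal assignment (e.g. Hungarian matching) and fuse appearance similarity with motion cues, but the open-vocabulary property comes entirely from the text-embedding comparison above.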

Siyuan Li, Tobias Fischer, Lei Ke, Henghui Ding, Martin Danelljan, Fisher Yu • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-Object Tracking | BDD100K (val) | – | 70 |
| Multi-Object Tracking | TAO (val) | AssocA 36.7 | 40 |
| Generic Multiple Object Tracking | Refer-GMOT40 | MOTA 27.78 | 26 |
| Object Tracking | TAO | TETA 34.7 | 22 |
| Multi-Object Tracking | TAO 1.0 (val) | Base TETA 36.3 | 14 |
| Multi-Object Tracking | TAO (test) | – | 13 |
| Multi-Object Tracking | TAO 1.0 (test) | Base TETA 34.8 | 8 |
| Closed-set MOT Track mAP comparison | TAO 1.0 (val) | Track mAP50 0.212 | 8 |
| Multi-Object Tracking | TAO (Base classes) | TETA 35.5 | 6 |
| Multi-Object Tracking | TAO (Novel classes) | TETA 27.8 | 6 |

(Showing 10 of 15 rows.)
