CAMELTrack: Context-Aware Multi-cue ExpLoitation for Online Multi-Object Tracking
About
Online multi-object tracking has recently been dominated by tracking-by-detection (TbD) methods, where recent advances rely on increasingly sophisticated heuristics for tracklet representation, feature fusion, and multi-stage matching. The key strength of TbD lies in its modular design, which enables the integration of specialized off-the-shelf models such as motion predictors and re-identification models. However, the extensive use of human-crafted rules for temporal association inherently limits the ability of these methods to capture the complex interplay between tracking cues. In this work, we introduce CAMEL, a novel association module for Context-Aware Multi-Cue ExpLoitation that learns resilient association strategies directly from data, breaking free from hand-crafted heuristics while maintaining TbD's valuable modularity. At its core, CAMEL employs two transformer-based modules and relies on a novel association-centric training scheme to effectively model the complex interactions between tracked targets and their various association cues. Unlike end-to-end detection-by-tracking approaches, our method remains lightweight and fast to train while being able to leverage external off-the-shelf models. Our proposed online tracking pipeline, CAMELTrack, achieves state-of-the-art performance on multiple tracking benchmarks. Our code is available at https://github.com/TrackingLaboratory/CAMELTrack.
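To make the contrast concrete, the sketch below illustrates the kind of hand-crafted, multi-cue association rule that the abstract says TbD trackers typically rely on. It is not CAMEL's method and is not taken from the CAMELTrack repository. The function names, the fixed weights `w_motion` and `w_appearance`, and the `min_score` threshold are all hypothetical choices made for illustration, and greedy matching is used in place of an optimal assignment solver to keep the example dependency-free.

```python
from math import sqrt

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    """Cosine similarity between two appearance embeddings."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def associate(tracks, detections, w_motion=0.5, w_appearance=0.5, min_score=0.3):
    """Greedily match tracks to detections with a hand-tuned cue fusion.

    The fixed weights and threshold are hypothetical: they stand in for the
    human-crafted rules that a learned association module would replace.
    Returns a sorted list of (track_index, detection_index) pairs.
    """
    scored = []
    for ti, t in enumerate(tracks):
        for di, d in enumerate(detections):
            score = (w_motion * iou(t["box"], d["box"])
                     + w_appearance * cosine(t["emb"], d["emb"]))
            if score >= min_score:
                scored.append((score, ti, di))
    scored.sort(reverse=True)  # best-scoring pairs are claimed first

    matches, used_tracks, used_dets = [], set(), set()
    for _, ti, di in scored:
        if ti in used_tracks or di in used_dets:
            continue
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    return sorted(matches)

# Toy input: two tracks and two detections listed in swapped order.
tracks = [
    {"box": (0, 0, 10, 10), "emb": (1.0, 0.0)},
    {"box": (20, 20, 30, 30), "emb": (0.0, 1.0)},
]
detections = [
    {"box": (21, 20, 31, 30), "emb": (0.0, 1.0)},
    {"box": (1, 0, 11, 10), "emb": (1.0, 0.0)},
]
matches = associate(tracks, detections)
```

On this toy input each track is paired with the detection that overlaps it and shares its embedding. In a rule like this, the weights and threshold must be tuned by hand for each dataset and cue combination, which is the limitation the abstract attributes to heuristic association.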
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Multiple Object Tracking | MOT17 (test) | MOTA | 78.5 | 921 |
| Multi-Object Tracking | DanceTrack (test) | HOTA | 0.693 | 355 |
| Multi-Object Tracking | SportsMOT (test) | HOTA | 80.4 | 199 |
| Multi-Object Tracking | SoccerNet (test) | HOTA | 54.2 | 23 |
| Multiple Object Tracking | PoseTrack21 (val) | MOTA | 67.5 | 13 |
| Multi-Object Tracking | BEE24 (test) | HOTA | 50.3 | 11 |