Learning Discriminative Model Prediction for Tracking
About
The current drive toward end-to-end trainable computer vision systems imposes major challenges for the task of visual tracking. In contrast to most other vision problems, tracking requires the learning of a robust target-specific appearance model online, during the inference stage. To be end-to-end trainable, the online learning of the target model thus needs to be embedded in the tracking architecture itself. Due to the imposed challenges, the popular Siamese paradigm simply predicts a target feature template, while ignoring the background appearance information during inference. Consequently, the predicted model possesses limited target-background discriminability. We develop an end-to-end tracking architecture, capable of fully exploiting both target and background appearance information for target model prediction. Our architecture is derived from a discriminative learning loss by designing a dedicated optimization process that is capable of predicting a powerful model in only a few iterations. Furthermore, our approach is able to learn key aspects of the discriminative loss itself. The proposed tracker sets a new state of the art on 6 tracking benchmarks, achieving an EAO score of 0.440 on VOT2018, while running at over 40 FPS. The code and models are available at https://github.com/visionml/pytracking.
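To make the idea of predicting a target model from a discriminative loss in a few iterations more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy setting that does not appear in the abstract: features are short vectors, the target model is a linear filter `w`, the loss is a ridge-regression objective over labelled target and background samples, and each iteration takes a steepest-descent step whose length is computed exactly for this quadratic loss. The names `predict_model`, `loss`, `feats`, `labels`, and `lam` are all hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def loss(w, feats, labels, lam):
    """Toy discriminative loss: squared residuals plus L2 regularisation."""
    residuals = [dot(x, w) - y for x, y in zip(feats, labels)]
    return sum(r * r for r in residuals) + lam * dot(w, w)


def initial_model(feats, labels):
    """Target-only starting point: mean of the positive (target) features.

    This ignores background samples, loosely mirroring a template-style model.
    """
    dim = len(feats[0])
    pos = [x for x, y in zip(feats, labels) if y > 0]
    if not pos:
        return [0.0] * dim
    return [sum(x[d] for x in pos) / len(pos) for d in range(dim)]


def predict_model(feats, labels, lam=0.1, num_iters=5):
    """Refine the target-only model with a few steepest-descent iterations.

    Assumption: the loss is quadratic in w, so the optimal step length along
    the gradient g is (g.g) / (g.H.g) with H = 2 * (X^T X + lam * I).
    """
    dim = len(feats[0])
    w = initial_model(feats, labels)
    for _ in range(num_iters):
        residuals = [dot(x, w) - y for x, y in zip(feats, labels)]
        # Gradient of the loss with respect to w.
        g = [
            2.0 * (sum(r * x[d] for r, x in zip(residuals, feats)) + lam * w[d])
            for d in range(dim)
        ]
        gg = dot(g, g)
        if gg == 0.0:
            break  # already at the minimum
        xg = [dot(x, g) for x in feats]
        step = gg / (2.0 * (dot(xg, xg) + lam * gg))
        w = [wi - step * gi for wi, gi in zip(w, g)]
    return w


# Two target samples (label 1) and two background samples (label 0).
feats = [[1.0, 0.2], [0.9, 0.1], [0.2, 1.0], [0.1, 0.8]]
labels = [1.0, 1.0, 0.0, 0.0]

w0 = initial_model(feats, labels)
w = predict_model(feats, labels)
print("loss with target-only model:", loss(w0, feats, labels, 0.1))
print("loss after a few iterations:", loss(w, feats, labels, 0.1))
```

In this sketch the background samples enter only through the loss, which is what pushes the refined filter to score them lower than the target-only starting point does. The actual tracker operates on deep feature maps and learns parts of the loss itself, neither of which this toy example attempts to reproduce.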
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Object Segmentation | DAVIS 2017 (val) | J mean | 60.1 | 1130 |
| Visual Object Tracking | TrackingNet (test) | Normalized Precision (Pnorm) | 80.1 | 460 |
| Visual Object Tracking | LaSOT (test) | AUC | 57.7 | 444 |
| Visual Object Tracking | GOT-10k (test) | Average Overlap | 61.1 | 378 |
| Object Tracking | LaSoT | AUC | 56.9 | 333 |
| Object Tracking | TrackingNet | Precision (P) | 70.6 | 225 |
| Visual Object Tracking | GOT-10k | AO | 61.1 | 223 |
| Visual Object Tracking | UAV123 (test) | AUC | 65.4 | 188 |
| RGB-D Object Tracking | VOT-RGBD 2022 (public challenge) | EAO | 54.3 | 167 |
| Visual Object Tracking | UAV123 | AUC | 0.654 | 165 |