Real-Time MDNet
About
We present a fast and accurate visual tracking algorithm based on the multi-domain convolutional neural network (MDNet). The proposed approach accelerates the feature extraction procedure and learns more discriminative models for instance classification; it enhances the representation quality of the target and background by maintaining a high-resolution feature map with a large receptive field per activation. We also introduce a novel loss term to differentiate foreground instances across multiple domains and learn a more discriminative embedding of target objects with similar semantics. The proposed techniques are integrated into the pipeline of a well-known CNN-based visual tracking algorithm, MDNet. We achieve an approximately 25-fold speed-up with almost identical accuracy compared to MDNet. Our algorithm is evaluated on multiple popular tracking benchmark datasets, including OTB2015, UAV123, and TempleColor, and consistently outperforms the state-of-the-art real-time tracking methods even without dataset-specific parameter tuning.
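The cross-domain loss term mentioned above can be illustrated with a minimal sketch. It assumes one plausible formulation, not necessarily the authors' exact one: a target sample receives a foreground score from each domain-specific branch, the scores are normalized with a softmax across domains, and the loss is the negative log-probability of the sample's own domain. The names `softmax` and `instance_embedding_loss` are hypothetical and chosen only for this illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def instance_embedding_loss(fg_scores, domain):
    """Assumed form of a cross-domain instance loss (illustrative only).

    fg_scores[d] is the foreground score that domain branch d assigns
    to a target sample; `domain` is the index of the domain (video)
    the sample actually comes from.
    """
    probs = softmax(fg_scores)          # normalize across domains
    return -math.log(probs[domain])     # penalize confusion with other domains


# A target that scores highly only in its own domain yields a small loss.
well_separated = instance_embedding_loss([5.0, 0.0, 0.0], domain=0)
# A target that scores higher in a foreign domain yields a large loss.
confused = instance_embedding_loss([0.0, 5.0, 0.0], domain=0)
```

Under this assumption, minimizing the loss pushes the foreground score of a target up in its own domain and down in all others, which is one way to separate targets with similar semantics that appear in different sequences.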
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual Object Tracking | TrackingNet (test) | Normalized Precision (Pnorm) | 69.4 | 460 |
| Visual Object Tracking | LaSOT (test) | AUC | 39.7 | 444 |
| Visual Object Tracking | UAV123 (test) | AUC | 52.8 | 188 |
| Visual Object Tracking | UAV123 | AUC | 0.528 | 165 |
| Visual Object Tracking | NfS | AUC | 0.433 | 112 |
| Visual Object Tracking | OTB 2015 | AUC | 65 | 58 |
| RGBT Tracking | RGBT-210 | Precision Rate | 71.5 | 54 |
| RGBT Tracking | RGBT 234 | Precision Rate | 75.8 | 53 |
| Visual Tracking | NfS (test) | AUC | 43.3 | 45 |
| RGBT Tracking | VOT-RGBT 2019 | EAO | 21.36 | 40 |