Context-aware Deep Feature Compression for High-speed Visual Tracking
About
We propose a new context-aware, correlation-filter-based tracking framework that achieves both high computational speed and state-of-the-art performance among real-time trackers. The high computational speed comes mainly from the proposed deep feature compression, which is achieved by a context-aware scheme utilizing multiple expert auto-encoders; a context in our framework refers to the coarse category of the tracking target according to appearance patterns. In the pre-training phase, one expert auto-encoder is trained per category. In the tracking phase, the best expert auto-encoder is selected for a given target, and only this auto-encoder is used. To achieve high tracking performance with the compressed feature map, we introduce extrinsic denoising processes and a new orthogonality loss term for pre-training and fine-tuning of the expert auto-encoders. We validate the proposed context-aware framework through a number of experiments, where our method achieves performance comparable to state-of-the-art trackers that cannot run in real time, while itself running at over 100 fps.
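The selection-then-compression idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the experts here are hypothetical linear auto-encoders with tied weights, the expert is chosen by lowest reconstruction error (an assumption made for illustration; the abstract does not state the selection rule), and `orthogonality_penalty` is one common form of such a term rather than the loss defined in the paper. All function names and dimensions are invented for the example.

```python
import numpy as np

def make_expert(rng, in_dim, code_dim):
    """Hypothetical linear expert: encoder W (code_dim x in_dim), decoder W.T."""
    q, _ = np.linalg.qr(rng.standard_normal((in_dim, code_dim)))
    return q.T  # rows of W are orthonormal

def compress(W, x):
    """Encode a feature vector into the low-dimensional code."""
    return W @ x

def reconstruct(W, z):
    """Decode with tied weights (assumption: decoder is the encoder transpose)."""
    return W.T @ z

def select_expert(experts, x):
    """Pick the expert with the lowest reconstruction error (illustrative rule)."""
    errors = [np.linalg.norm(x - reconstruct(W, compress(W, x))) for W in experts]
    return int(np.argmin(errors))

def orthogonality_penalty(W):
    """Squared Frobenius distance of W W^T from the identity (assumed form)."""
    gram = W @ W.T
    return float(np.sum((gram - np.eye(W.shape[0])) ** 2))

rng = np.random.default_rng(0)
experts = [make_expert(rng, in_dim=64, code_dim=8) for _ in range(4)]

# A feature lying in expert 2's subspace is reconstructed exactly by that expert.
x = experts[2].T @ rng.standard_normal(8)
best = select_expert(experts, x)

# Only the selected expert is used afterwards: 64 channels are compressed to 8.
z = compress(experts[best], x)
```

Once an expert is selected, every later frame is compressed by that single encoder, so the cost of keeping several experts is paid only at initialization.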
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Tracking | OTB 2015 (test) | AUC 0.602 | 63 |
| Visual Object Tracking | OTB 2013 | AUC 65.2 | 60 |
| Visual Object Tracking | CVPR 2013 (test) | Precision Score 89.8 | 23 |
| Visual Object Tracking | OTB 2013 (test) | AUC 65.2 | 16 |