End-to-end representation learning for Correlation Filter based tracking
About
The Correlation Filter is an algorithm that trains a linear template to discriminate between images and their translations. It is well suited to object tracking because its formulation in the Fourier domain provides a fast solution, enabling the detector to be re-trained once per frame. Previous works that use the Correlation Filter, however, have adopted features that were either manually designed or trained for a different task. This work is the first to overcome this limitation by interpreting the Correlation Filter learner, which has a closed-form solution, as a differentiable layer in a deep neural network. This enables learning deep features that are tightly coupled to the Correlation Filter. Experiments illustrate that our method has the important practical benefit of allowing lightweight architectures to achieve state-of-the-art performance at high framerates.
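To make the closed-form Fourier-domain solution concrete, here is a minimal single-channel sketch in NumPy. It is not the paper's network or code. It assumes a ridge-regression formulation with circular shifts, and the names `gaussian_response`, `train_filter`, `respond`, `peak` and the parameters `sigma` and `lam` are hypothetical choices made for illustration. Training is an element-wise division in the Fourier domain. Every step is a differentiable operation, which is what makes it plausible to treat the learner as a layer.

```python
import numpy as np

def gaussian_response(shape, sigma=2.0):
    """Desired response: a Gaussian peak at the origin, wrapped circularly."""
    h, w = shape
    dy = np.minimum(np.arange(h), h - np.arange(h))[:, None]
    dx = np.minimum(np.arange(w), w - np.arange(w))[None, :]
    return np.exp(-(dy ** 2 + dx ** 2) / (2.0 * sigma ** 2))

def train_filter(x, y, lam=1e-2):
    """Closed-form ridge regression over all circular shifts of x.

    Returns the template in the Fourier domain, stored in the conjugated
    form that can be multiplied directly with the DFT of a search image.
    `lam` is an assumed regularisation strength.
    """
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    return np.conj(X) * Y / (np.abs(X) ** 2 + lam)

def respond(template_hat, z):
    """Response map of the trained template on a search image z."""
    return np.real(np.fft.ifft2(template_hat * np.fft.fft2(z)))

def peak(response):
    """Location of the maximum of a response map, as a tuple of ints."""
    idx = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in idx)

# Toy example on random data (an assumption; real trackers use image features).
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
template_hat = train_filter(x, gaussian_response(x.shape))
shifted = np.roll(x, shift=(3, 5), axis=(0, 1))
print(peak(respond(template_hat, x)), peak(respond(template_hat, shifted)))
```

In this toy setting the response peak on the training image should sit at the origin, and circularly shifting the image by (3, 5) should move the peak by the same offset. This is the translation-discriminating behaviour that tracking relies on.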
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual Object Tracking | TrackingNet (test) | Normalized Precision (Pnorm) | 65.4 | 460 |
| Visual Object Tracking | GOT-10k (test) | Average Overlap | 37.4 | 378 |
| Object Tracking | TrackingNet | Precision (P) | 57.8 | 225 |
| Visual Object Tracking | UAV123 (test) | AUC | 43.6 | 188 |
| Visual Object Tracking | OTB-100 | AUC | 56.8 | 136 |
| Object Tracking | OTB 2015 (test) | AUC | 0.568 | 63 |
| Visual Object Tracking | OTB 2013 | AUC | 61 | 60 |
| Visual Object Tracking | OTB 2015 | AUC | 62 | 58 |
| RGBT Tracking | RGBT 234 | Precision Rate | 55.1 | 53 |
| Visual Object Tracking | GOT-10k 1.0 (test) | AO | 37.4 | 51 |