Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking
About
Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual object tracking. The key to their success is the ability to efficiently exploit available negative data by including all shifted versions of a training sample. However, the underlying DCF formulation is restricted to single-resolution feature maps, significantly limiting its potential. In this paper, we go beyond the conventional DCF framework and introduce a novel formulation for training continuous convolution filters. We employ an implicit interpolation model to pose the learning problem in the continuous spatial domain. Our proposed formulation enables efficient integration of multi-resolution deep feature maps, leading to superior results on three object tracking benchmarks: OTB-2015 (+5.1% in mean OP), Temple-Color (+4.6% in mean OP), and VOT2015 (20% relative reduction in failure rate). Additionally, our approach is capable of sub-pixel localization, crucial for the task of accurate feature point tracking. We also demonstrate the effectiveness of our learning formulation in extensive feature point tracking experiments. Code and supplementary material are available at http://www.cvl.isy.liu.se/research/objrec/visualtracking/conttrack/index.html.
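The sub-pixel localization idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1-D response signal, uses plain trigonometric (Fourier series) interpolation as a stand-in for the paper's interpolation model, and finds the peak by a dense grid search. The function names `fourier_interpolate` and `subpixel_peak` and the `step` parameter are hypothetical, chosen only for this example.

```python
import numpy as np


def fourier_interpolate(samples, t):
    """Evaluate the trigonometric interpolant of `samples` at continuous
    positions `t`, measured in units of the sample spacing."""
    n = len(samples)
    coeffs = np.fft.fft(samples) / n          # Fourier series coefficients
    freqs = np.fft.fftfreq(n, d=1.0 / n)      # integer frequencies 0, 1, ..., -1
    t = np.atleast_1d(np.asarray(t, dtype=float))
    basis = np.exp(2j * np.pi * np.outer(t, freqs) / n)
    return (basis @ coeffs).real


def subpixel_peak(samples, step=0.01):
    """Locate the maximum of the interpolated signal on a dense grid
    (assumption: a grid search is accurate enough for illustration)."""
    grid = np.arange(0.0, len(samples), step)
    values = fourier_interpolate(samples, grid)
    return float(grid[np.argmax(values)])


# Synthetic response whose true peak lies between samples 7 and 8.
n = 31
x = np.arange(n)
samples = np.cos(2 * np.pi * (x - 7.3) / n)

print("integer argmax:", int(np.argmax(samples)))
print("sub-pixel peak:", subpixel_peak(samples))
```

The discrete argmax can only return an integer sample index, whereas evaluating the interpolant on a finer grid recovers a peak location between samples. A continuous formulation makes this kind of refinement a natural part of the model rather than a post-processing step.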
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual Object Tracking | GOT-10k (test) | Average Overlap | 32.5 | 378 |
| Visual Object Tracking | UAV123 (test) | AUC | 51.3 | 188 |
| Visual Object Tracking | UAV123 | AUC | 0.577 | 165 |
| Visual Object Tracking | OTB-100 | AUC | 68.2 | 136 |
| Visual Object Tracking | NfS | AUC | 0.488 | 112 |
| Visual Object Tracking | VOT 2016 | EAO | 33.1 | 79 |
| Visual Tracking | VOT 2016 (test) | EAO | 0.331 | 70 |
| Visual Object Tracking | VOT 2015 | EAO | 0.303 | 61 |
| Visual Object Tracking | NFS (Need for Speed) 30 FPS (test) | AUC | 48.8 | 54 |
| Visual Object Tracking | GOT-10k 1.0 (test) | AO | 32.5 | 51 |