
HiFT: Hierarchical Feature Transformer for Aerial Tracking

About

Most existing Siamese-based tracking methods perform classification and regression of the target object based on similarity maps. However, they either employ a single map from the last convolutional layer, which degrades localization accuracy in complex scenarios, or use multiple maps separately for decision making, which introduces intractable computation for aerial mobile platforms. Thus, in this work, we propose an efficient and effective hierarchical feature transformer (HiFT) for aerial tracking. Hierarchical similarity maps generated by multi-level convolutional layers are fed into the feature transformer to achieve the interactive fusion of spatial cues (shallow layers) and semantic cues (deep layers). Consequently, global contextual information is enhanced, which facilitates the target search, and our end-to-end architecture with the transformer can efficiently learn the interdependencies among multi-level features, thereby discovering a tracking-tailored feature space with strong discriminability. Comprehensive evaluations on four aerial benchmarks demonstrate the effectiveness of HiFT. Real-world tests on an aerial platform validate its practicality at real-time speed. Our code is available at https://github.com/vision4robotics/HiFT.

Ziang Cao, Changhong Fu, Junjie Ye, Bowen Li, Yiming Li • 2021
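
The fusion idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a channel-wise cross-correlation to build a similarity map at each level, and a single attention step with identity projections in which deep-level tokens attend over shallow-level tokens. The function names, the toy feature values, and the choice of which level supplies the queries are all illustrative assumptions.

```python
import math


def xcorr_depthwise(search, template):
    """Channel-wise 'valid' cross-correlation.

    search: [C][Hs][Ws], template: [C][Ht][Wt]
    returns a similarity map of shape [C][Hs-Ht+1][Ws-Wt+1].
    """
    maps = []
    for s, t in zip(search, template):
        th, tw = len(t), len(t[0])
        rows = len(s) - th + 1
        cols = len(s[0]) - tw + 1
        maps.append([[sum(s[i + a][j + b] * t[a][b]
                          for a in range(th) for b in range(tw))
                      for j in range(cols)]
                     for i in range(rows)])
    return maps


def to_tokens(sim_map):
    """Flatten a [C][H][W] map into H*W tokens, each of dimension C."""
    c, h, w = len(sim_map), len(sim_map[0]), len(sim_map[0][0])
    return [[sim_map[k][i][j] for k in range(c)]
            for i in range(h) for j in range(w)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention with identity projections (an
    assumption made for brevity) followed by a residual connection."""
    d = len(queries[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        context = [sum(w * kv[c] for w, kv in zip(weights, keys_values))
                   for c in range(d)]
        fused.append([qi + ci for qi, ci in zip(q, context)])
    return fused


# Toy two-channel features standing in for backbone outputs (made-up values).
shallow_search = [[[float((i + j + c) % 3) for j in range(4)]
                   for i in range(4)] for c in range(2)]
shallow_template = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(2)]
deep_search = [[[float((i * j + c) % 2) for j in range(3)]
                for i in range(3)] for c in range(2)]
deep_template = [[[0.5, 0.5], [0.5, 0.5]] for _ in range(2)]

# One similarity map per level, flattened into token sequences.
shallow_tokens = to_tokens(xcorr_depthwise(shallow_search, shallow_template))
deep_tokens = to_tokens(xcorr_depthwise(deep_search, deep_template))

# Deep (semantic) tokens query shallow (spatial) tokens; illustrative choice.
fused = cross_attention(deep_tokens, shallow_tokens)
print(len(fused), len(fused[0]))
```

Because attention operates on token sequences, the two levels do not need the same spatial resolution, only the same channel dimension. A real tracker would add learned projections, multiple heads, positional encodings, and classification and regression heads on top of the fused features.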

Related benchmarks

Task                    Dataset               Result           Rank
Visual Object Tracking  UAV123 (test)         --               188
Visual Tracking         UAV123                --               41
UAV Tracking            UAVDT                 Precision 78.7   32
UAV Tracking            DTB70                 Precision 0.802  32
UAV Tracking            VisDrone 2018         Precision 74     32
Visual Object Tracking  UAV123                SUC 59           25
Visual Object Tracking  DTB70 (test)          AUC 59.4         19
Visual Object Tracking  UAVDT (test)          AUC 47.5         19
Visual Object Tracking  UAVTrack112 L (test)  AUC (%) 55.1     19
Visual Object Tracking  UAVTrack112 (test)    AUC 57           19
Showing 10 of 17 rows
