BGG: Bridging the Geometric Gap between Cross-View images by Vision Foundation Model Adaptation for Geo-Localization

About

Geometric differences between cross-view images, such as drone and satellite views, make Cross-View Geo-Localization (CVGL), which determines the geolocation of an image through image retrieval, considerably harder. To improve CVGL performance, this paper proposes BGG, a parameter-efficient adaptation framework that bridges the geometric gap between cross-view images on top of a vision foundation model (VFM), e.g., DINOv3. BGG leverages the general visual representations and generalization capability of the VFM to capture robust, consistent features from cross-view images. It consists mainly of a Multi-granularity Feature Enhancement Adapter (MFEA) and a Frequency-Aware Structural Aggregation (FASA) module. MFEA improves the scale adaptability and viewpoint robustness of features through multi-level dilated convolutions, bridging the cross-view geometric gap at low training cost. Because the [CLS] token lacks the spatial detail needed for precise image retrieval and localization, the FASA module modulates patch tokens in the frequency domain and aggregates them adaptively to enhance local structural features. Finally, BGG fuses the enhanced local features with the [CLS] token for more accurate CVGL. Extensive experiments on the University-1652 and SUES-200 datasets show that BGG outperforms other methods and achieves state-of-the-art localization performance with low training costs.
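The abstract does not specify the MFEA layer layout, so the following is only a rough sketch of the general idea behind multi-level dilated convolutions: the same small kernel is applied at several dilation rates, which widens the receptive field without adding weights, and the branches are merged and added back as a residual. It is written in 1-D pure Python for readability; the function names, the averaging of branches, and the residual connection are assumptions for illustration, not the authors' implementation.

```python
def receptive_field(kernel_size, dilation):
    """Span of input positions covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation):
    """'Same'-size 1-D cross-correlation with zero padding and a dilation rate."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - center) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def multi_dilation_adapter(signal, kernel, dilations=(1, 2, 3)):
    """Hypothetical adapter: average the dilated branches, then add a residual."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    fused = [sum(values) / len(branches) for values in zip(*branches)]
    return [x + f for x, f in zip(signal, fused)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    for d in (1, 2, 3):
        print(d, receptive_field(3, d), dilated_conv1d(impulse, [1, 1, 1], d))
```

With a kernel of size 3, dilation rates 1, 2, and 3 cover spans of 3, 5, and 7 input positions respectively, which is the sense in which several rates give features at multiple granularities.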

Wei Wang, Dou Quan, Ning Huyan, Shuang Wang, Yi Li, Pei He, Licheng Jiao • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Cross-view geo-localization | University-1652 Drone -> Satellite | R@1 96.24 | 149 |
| Cross-view geo-localization | University-1652 Satellite -> Drone | R@1 97.57 | 112 |
| Drone-to-Satellite Cross-view Geo-localization | SUES-200 150m | R@1 99.3 | 74 |
| Drone-to-Satellite Cross-view Geo-localization | SUES-200 250m | R@1 99.53 | 49 |
| Cross-view Geo-localization (Drone to Satellite) | SUES-200 300m altitude | R@1 99.25 | 48 |
| Cross-view Geo-localization (Satellite to Drone) | SUES-200 300m altitude | R@1 98.75 | 47 |
| Cross-view geo-localization | SUES-200 Satellite→Drone (200m) | R@1 98.75 | 41 |
| Cross-view Geo-localization (Satellite to Drone) | SUES-200 250m altitude | R@1 98.75 | 38 |
| Drone-to-Satellite Cross-view Geo-localization | SUES-200 200m | R@1 99.45 | 37 |
| Cross-view geo-localization | SUES-200 Satellite→Drone 150m | R@1 98.75 | 30 |
Showing 10 of 16 rows
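
The results above are reported as R@1 (Recall@1): the percentage of query images whose top-ranked gallery image shows the same location. As a minimal sketch of how such a score is typically computed for retrieval, assuming cosine similarity between feature vectors and one label per image, the following uses toy 2-D features. The function names and the data are illustrative assumptions and do not reproduce the official evaluation protocol of University-1652 or SUES-200.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recall_at_k(queries, query_labels, gallery, gallery_labels, k=1):
    """Percentage of queries with a correct match among the top-k gallery items."""
    hits = 0
    for feature, label in zip(queries, query_labels):
        # Rank gallery indices from most to least similar to the query.
        ranked = sorted(
            range(len(gallery)),
            key=lambda i: cosine(feature, gallery[i]),
            reverse=True,
        )
        if any(gallery_labels[i] == label for i in ranked[:k]):
            hits += 1
    return 100.0 * hits / len(queries)


if __name__ == "__main__":
    # Toy data: three gallery (e.g. satellite) features and three queries (e.g. drone).
    gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    gallery_labels = ["a", "b", "c"]
    queries = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.1]]
    query_labels = ["a", "b", "c"]
    print(recall_at_k(queries, query_labels, gallery, gallery_labels, k=1))
    print(recall_at_k(queries, query_labels, gallery, gallery_labels, k=2))
```

In this toy setup the third query is closest to the wrong gallery item, so it counts as a miss at k=1 but as a hit at k=2.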
