VFM-Loc: Zero-Shot Cross-View Geo-Localization via Aligning Discriminative Visual Hierarchies
About
Cross-View Geo-Localization (CVGL) in remote sensing aims to locate a drone-view query by matching it to geo-tagged satellite images. Although supervised methods have achieved strong results on closed-set benchmarks, they often fail to generalize to unconstrained, real-world scenarios due to severe viewpoint differences and dataset bias. To overcome these limitations, we present VFM-Loc, a training-free framework for zero-shot CVGL that leverages the generalizable visual representations of vision foundation models (VFMs). VFM-Loc identifies and matches discriminative visual clues across different viewpoints through a progressive alignment strategy. First, we design a hierarchical clue extraction mechanism using Generalized Mean pooling and Scale-Weighted RMAC to preserve distinctive visual clues across scales while maintaining hierarchical confidence. Second, we introduce a statistical manifold alignment pipeline based on domain-wise PCA and Orthogonal Procrustes analysis, which linearly aligns heterogeneous feature distributions in a shared metric space. Experiments demonstrate that VFM-Loc achieves strong zero-shot accuracy on standard benchmarks and surpasses supervised methods by over 20% in Recall@1 on the challenging LO-UCV dataset with large oblique angles. This work highlights that principled alignment of pre-trained features can effectively bridge the cross-view gap, establishing a robust and training-free paradigm for real-world CVGL. The relevant code is available at: https://github.com/DingLei14/VFM-Loc.
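
The two stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`gem_pool`, `pca_project`, `procrustes_rotation`), the exponent `p=3.0`, the PCA dimension, and the synthetic data are all assumptions made for illustration. Scale-Weighted RMAC is omitted because the abstract does not specify it. The sketch also assumes that row-aligned drone/satellite feature pairs are available for estimating the rotation; how VFM-Loc obtains correspondences is not stated here.

```python
import numpy as np

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized Mean pooling over N tokens of shape (N, C).

    p = 1 reduces to average pooling; larger p moves toward max pooling.
    Values are clamped to eps so the fractional power is well defined.
    """
    clamped = np.clip(feature_map, eps, None)
    return np.mean(clamped ** p, axis=0) ** (1.0 / p)

def pca_project(features, dim):
    """Domain-wise PCA: center with this domain's own mean, keep top `dim` axes."""
    centered = features - features.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T

def procrustes_rotation(source, target):
    """Orthogonal matrix R minimizing ||source @ R - target||_F.

    Closed form: with source.T @ target = U S Vt, the minimizer is R = U @ Vt.
    Assumes rows of `source` and `target` correspond (hypothetical pairing).
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

# GeM on a toy 2-token, 2-channel map.
tokens = np.array([[1.0, 2.0], [3.0, 4.0]])
descriptor = gem_pool(tokens, p=3.0)

# Synthetic "satellite" descriptors and a rotated, shifted "drone" copy.
rng = np.random.default_rng(0)
sat = rng.normal(size=(50, 8))
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
drone = sat @ q.T + 0.5

# Per-domain PCA, then rotate the drone space onto the satellite space.
sat_z = pca_project(sat, 4)
drone_z = pca_project(drone, 4)
rot = procrustes_rotation(drone_z, sat_z)
aligned = drone_z @ rot
```

In this synthetic setting the two domains differ only by a rotation and an offset, so centering removes the offset and the Procrustes rotation maps `drone_z` back onto `sat_z`. Real cross-view features will not align exactly; the rotation then gives the least-squares orthogonal fit.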
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Cross-view geo-localization | University-1652 Drone -> Satellite | R@1: 76.36 | 94 |
| Cross-view geo-localization | University-1652 Satellite -> Drone | R@1: 92.58 | 81 |
| Satellite→Drone Geo-localization | SUES-200 250m | R@1: 99.63 | 36 |
| Satellite→Drone Geo-localization | SUES-200 200m | R@1: 98.75 | 36 |
| Drone-to-Satellite Cross-view Geo-localization | SUES-200 150m | R@1: 99.62 | 25 |
| Drone-to-Satellite cross-view geolocalization | LO-UCV | Recall@1: 89.84 | 14 |
| Satellite-to-Drone cross-view geolocalization | LO-UCV | Recall@1: 95.31 | 14 |
| Satellite-to-Drone Geo-localization | SUES-200 altitude (150m) | R@1: 98.75 | 13 |
| Cross-view Geo-localization (Drone to Satellite) | SUES 200m altitude | R@1: 99.52 | 8 |
| Cross-view Geo-localization (Drone to Satellite) | SUES-200 300m altitude | R@1: 99.6 | 8 |
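
The R@1 (Recall@1) scores in the table are retrieval metrics: the fraction of queries whose top-ranked gallery image is a correct match. A minimal sketch of how such a score might be computed follows, assuming cosine similarity over feature vectors and integer location labels. The function name `recall_at_k` and the toy data are hypothetical, and the exact evaluation protocol of each benchmark may differ.

```python
import numpy as np

def recall_at_k(query_feats, gallery_feats, query_labels, gallery_labels, k=1):
    """Fraction of queries with at least one correct match in the top-k results."""
    # L2-normalize so the dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T
    # Indices of the k most similar gallery items for each query.
    topk = np.argsort(-sims, axis=1)[:, :k]
    hits = (gallery_labels[topk] == query_labels[:, None]).any(axis=1)
    return float(hits.mean())

# Toy example: three gallery items, three queries.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
gallery_labels = np.array([0, 1, 2])
queries = np.array([[0.9, 0.1], [0.1, 0.9], [1.0, 0.2]])
query_labels = np.array([0, 1, 2])

r1 = recall_at_k(queries, gallery, query_labels, gallery_labels, k=1)
r2 = recall_at_k(queries, gallery, query_labels, gallery_labels, k=2)
```

Here the third query is closest to gallery item 0 rather than its true match, so it counts as a miss at k=1 but as a hit at k=2.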