GeoFlow: Real-Time Fine-Grained Cross-View Geolocalization via Iterative Flow Prediction

About

Accurate and fast localization is vital for safe autonomous navigation in GPS-denied areas. Fine-Grained Cross-View Geolocalization (FG-CVG) aims to estimate the precise 2-Degree-of-Freedom (2-DoF) location of a ground image relative to a satellite image. However, current methods force a difficult trade-off: high-accuracy models are too slow for real-time use. In this paper, we introduce GeoFlow, a lightweight and highly efficient framework that breaks this accuracy-speed trade-off. Our technique learns a direct probabilistic mapping, predicting the displacement (in distance and direction) required to correct any given location hypothesis. This is complemented by our novel inference algorithm, Iterative Refinement Sampling (IRS). Instead of trusting a single prediction, IRS refines a population of hypotheses, allowing them to iteratively 'flow' from random starting points to a robust, converged consensus. Even with its iterative nature, this approach offers flexible inference-time scaling, allowing a direct trade-off between performance and computation without any re-training. Experiments on the KITTI and VIGOR datasets show that GeoFlow achieves state-of-the-art efficiency, running at real-time speeds of 29 FPS while maintaining competitive localization accuracy. This work opens a new path for the development of practical real-time geolocalization systems.
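The refinement loop described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: `toy_displacement` is a hypothetical stand-in for the learned model (it simply points each hypothesis at a known target, with noise), and the step size, the population size, and the use of the median as the "consensus" are all assumptions made for illustration.

```python
import numpy as np


def toy_displacement(hypotheses, target, rng, noise=0.5):
    """Hypothetical stand-in for the learned predictor (assumption).

    Returns a (distance, direction) pair for every 2-DoF hypothesis,
    pointing roughly toward `target` with Gaussian noise added.
    """
    delta = target - hypotheses + rng.normal(0.0, noise, hypotheses.shape)
    distance = np.linalg.norm(delta, axis=1)          # metres to move
    direction = np.arctan2(delta[:, 1], delta[:, 0])  # heading in radians
    return distance, direction


def iterative_refinement(predict, n_hypotheses=64, n_steps=5,
                         extent=50.0, step=0.5, seed=0):
    """Move a population of random hypotheses along predicted displacements.

    `n_steps` and `n_hypotheses` are the inference-time knobs: raising them
    costs more computation but needs no re-training of `predict`.
    """
    rng = np.random.default_rng(seed)
    # Random starting points spread over the (assumed square) search area.
    hyps = rng.uniform(-extent, extent, size=(n_hypotheses, 2))
    for _ in range(n_steps):
        distance, direction = predict(hyps, rng)
        # Convert (distance, direction) into an (x, y) offset per hypothesis.
        unit = np.stack([np.cos(direction), np.sin(direction)], axis=1)
        hyps = hyps + step * distance[:, None] * unit
    # Assumed consensus rule: the per-axis median of the final population.
    return np.median(hyps, axis=0), hyps


# Usage: recover a known 2-DoF location from random initial guesses.
target = np.array([12.0, -7.0])
estimate, final_hyps = iterative_refinement(
    lambda h, rng: toy_displacement(h, target, rng), n_steps=10)
print("estimate:", estimate, "error (m):", np.linalg.norm(estimate - target))
```

In this toy setting, lowering `n_steps` or `n_hypotheses` reduces computation at the cost of a noisier estimate, which mirrors the performance-versus-computation trade-off the abstract attributes to IRS.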

Ayesh Abu Lehyeh, Xiaohan Zhang, Ahmad Arrabi, Waqas Sultani, Chen Chen, Safwan Wshah • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Location and orientation estimation | VIGOR (Cross-Area) | Location Mean Error (m) | 4.62 | 28 |
| Location and orientation estimation | VIGOR (Same-Area) | Location Mean Error (m) | 3.51 | 28 |
| Position and Orientation Estimation | KITTI Cross-area | Position Lateral Recall R@1m (%) | 36.36 | 13 |
| Cross-View Geolocalization | KITTI Same-Area (test) | Lateral Recall @ 1m | 96.85 | 6 |
| 3-DoF Pose Estimation | KITTI Same-area | Location Mean Error (m) | 1.03 | 2 |
| 3-DoF Pose Estimation | KITTI Cross-area | Location Mean Error (m) | 8.53 | 2 |