
Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation

About

Global visual geolocation predicts where an image was captured on Earth. Since images vary in how precisely they can be localized, this task inherently involves a significant degree of ambiguity. However, existing approaches are deterministic and overlook this aspect. In this paper, we aim to close the gap between traditional geolocalization and modern generative methods. We propose the first generative geolocation approach based on diffusion and Riemannian flow matching, where the denoising process operates directly on the Earth's surface. Our model achieves state-of-the-art performance on three visual geolocation benchmarks: OpenStreetView-5M, YFCC-100M, and iNat21. In addition, we introduce the task of probabilistic visual geolocation, where the model predicts a probability distribution over all possible locations instead of a single point. We introduce new metrics and baselines for this task, demonstrating the advantages of our diffusion-based approach. Code and models will be made available.
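To illustrate what "operating directly on the Earth's surface" can mean in practice, the sketch below interpolates between two locations along the great-circle geodesic of the unit sphere, a building block commonly used for flow matching on spherical manifolds. This is not the authors' implementation; the function names (`to_unit_vector`, `to_lat_lon`, `geodesic_interpolate`) are hypothetical, and treating the Earth as a perfect unit sphere is an assumption.

```python
import math

def to_unit_vector(lat_deg, lon_deg):
    """Map (latitude, longitude) in degrees to a 3D point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def to_lat_lon(v):
    """Map a 3D unit vector back to (latitude, longitude) in degrees."""
    x, y, z = v
    lat = math.degrees(math.asin(max(-1.0, min(1.0, z))))
    lon = math.degrees(math.atan2(y, x))
    return lat, lon

def geodesic_interpolate(x0, x1, t):
    """Point at fraction t in [0, 1] along the great circle from x0 to x1."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x0, x1))))
    omega = math.acos(dot)  # angle between the two points
    if omega < 1e-9:
        return x0  # points coincide; nothing to interpolate
    if math.pi - omega < 1e-9:
        raise ValueError("antipodal points: the geodesic is not unique")
    s = math.sin(omega)
    w0 = math.sin((1.0 - t) * omega) / s
    w1 = math.sin(t * omega) / s
    return tuple(w0 * a + w1 * b for a, b in zip(x0, x1))

# Usage: the midpoint between two points on the equator stays on the equator.
start = to_unit_vector(0.0, 0.0)
end = to_unit_vector(0.0, 90.0)
mid_lat, mid_lon = to_lat_lon(geodesic_interpolate(start, end, 0.5))
```

Every interpolated point stays on the sphere, so no projection step back onto the surface is needed, unlike straight-line interpolation in latitude/longitude space.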

Nicolas Dufour, David Picard, Vicky Kalogeiton, Loic Landrieu • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Geolocalization | IM2GPS3K (test) | Success Rate (25km) | 36.8 | 122 |
| Image Geolocalization | YFCC4K (test) | Success Rate (Region, 200km) | 45.3 | 86 |
| Image Geolocalization | YFCC4k | Success Rate (1km) | 6.6 | 30 |
| Image Geolocalization | IM2GPS | Success Rate @ 25 km (City) | 44.7 | 26 |
| Visual Geolocation | OSV-5M (test) | Accuracy (Country) | 76.2 | 20 |
| Image Geolocalization | YFCC26k | Success Rate @ 1 km (Street) | 5.3 | 14 |
| Geolocation | AVG (test) | City Acc (25km) | 6.2 | 10 |
| Image Geolocation | CCL-Bench | Accuracy @ 1km | 7.33 | 8 |
| Image Geolocation | CCL-Bench | City ACC | 15 | 8 |
| Next-Chapter Prediction | Next-Chapter Prediction (NCP) (test) | Contra Improve | 0.083 | 7 |
Showing 10 of 19 rows
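
Several of the metrics above are success rates at a distance threshold (1 km for street level, 25 km for city level, 200 km for region level). A minimal sketch of how such a metric could be computed is given below, assuming predictions and ground truth are (latitude, longitude) pairs in degrees and that distance is measured with the haversine formula on a sphere of radius 6371 km. The function names and the exact distance convention are assumptions; individual benchmarks may define the metric differently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def success_rate(preds, targets, threshold_km):
    """Percentage of predictions within threshold_km of their ground-truth location."""
    hits = sum(
        haversine_km(p[0], p[1], t[0], t[1]) <= threshold_km
        for p, t in zip(preds, targets)
    )
    return 100.0 * hits / len(targets)

# Usage: one prediction lands in the right city, the other is a country away.
preds = [(48.8566, 2.3522), (51.5074, -0.1278)]   # Paris, London
targets = [(48.8600, 2.3500), (48.8566, 2.3522)]  # both near Paris
city_level = success_rate(preds, targets, threshold_km=25)
```

Reporting the same predictions at several thresholds (1, 25, 200 km) shows how accuracy degrades as the required precision tightens.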
