
Map-free Visual Relocalization: Metric Pose Relative to a Single Image

About

Can we relocalize in a scene represented by a single reference image? Standard visual relocalization requires hundreds of images and scale calibration to build a scene-specific 3D map. In contrast, we propose Map-free Relocalization, i.e., using only one photo of a scene to enable instant, metric scaled relocalization. Existing datasets are not suitable to benchmark map-free relocalization, due to their focus on large scenes or their limited variability. Thus, we have constructed a new dataset of 655 small places of interest, such as sculptures, murals and fountains, collected worldwide. Each place comes with a reference image to serve as a relocalization anchor, and dozens of query images with known, metric camera poses. The dataset features changing conditions, stark viewpoint changes, high variability across places, and queries with low to no visual overlap with the reference image. We identify two viable families of existing methods to provide baseline results: relative pose regression, and feature matching combined with single-image depth prediction. While these methods show reasonable performance on some favorable scenes in our dataset, map-free relocalization proves to be a challenge that requires new, innovative solutions.
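To make the second baseline family concrete, here is a minimal sketch of one way feature matches and single-image depth can yield a metric relative pose: lift the matched pixels to 3D using the predicted depths, then rigidly align the two point sets (Kabsch/Procrustes). This is an illustration under stated assumptions and not necessarily the authors' pipeline. The function names `backproject` and `rigid_align` are hypothetical, the matches and depths are assumed to come from external matching and depth-prediction models, and a real system would wrap the alignment in a robust estimator such as RANSAC.

```python
import numpy as np

def backproject(pixels, depths, K):
    """Lift N x 2 pixel coordinates with metric depths to N x 3 camera-frame points."""
    ones = np.ones((pixels.shape[0], 1))
    # Rays with z = 1 in the camera frame: K^-1 [u, v, 1]^T
    rays = (np.linalg.inv(K) @ np.hstack([pixels, ones]).T).T
    return rays * depths[:, None]

def rigid_align(src, dst):
    """Kabsch alignment: find R, t minimizing sum ||R @ src_i + t - dst_i||^2."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)       # 3 x 3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def metric_relative_pose(px_ref, depth_ref, px_query, depth_query, K):
    """Pose mapping reference-camera points into the query camera frame (hypothetical helper)."""
    pts_ref = backproject(px_ref, depth_ref, K)
    pts_query = backproject(px_query, depth_query, K)
    return rigid_align(pts_ref, pts_query)
```

Because the depths are metric, the recovered translation is metric too; its accuracy is bounded by the quality of the depth predictions, which is consistent with the abstract's observation that these baselines only work well on favorable scenes.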

Eduardo Arnold, Jamie Wynn, Sara Vicente, Guillermo Garcia-Hernando, Áron Monszpart, Victor Adrian Prisacariu, Daniyar Turmukhambetov, Eric Brachmann • 2022

Related benchmarks

Task | Dataset | Result | Rank
Relative Pose Estimation | MegaDepth 1500 | AUC @ 20°: 10 | 151
Relative Pose Estimation | ScanNet 1500 pairs (test) | AUC@5°: 1.84 | 56
Camera Localization | 7 Scenes | Average Position Error (m): 0.13 | 46
Visual Localization | Cambridge Landmarks (test) | Avg Median Positional Error (m): 2.51 | 35
Visual Localization | Cambridge Landmarks College | Median Translation Error (m): 2.44 | 35
Visual Localization | Cambridge Landmarks Church | Median Translation Error (m): 2.91 | 35
Camera pose estimation | ACID | AUC @ 5°: 0.0257 | 30
Pose Estimation | 7 Scenes | Average Median Translation Error (m): 0.12 | 29
Visual Localization | Cambridge Landmarks Hospital | Median Translation Error (m): 3.73 | 26
Relative Camera Pose Evaluation | ScanNet1500 | AUC@5: 1.84 | 23
Showing 10 of 20 rows
