Map-free Visual Relocalization: Metric Pose Relative to a Single Image

About

Can we relocalize in a scene represented by a single reference image? Standard visual relocalization requires hundreds of images and scale calibration to build a scene-specific 3D map. In contrast, we propose Map-free Relocalization, i.e., using only one photo of a scene to enable instant, metric-scaled relocalization. Existing datasets are not suitable to benchmark map-free relocalization, due to their focus on large scenes or their limited variability. Thus, we have constructed a new dataset of 655 small places of interest, such as sculptures, murals and fountains, collected worldwide. Each place comes with a reference image to serve as a relocalization anchor, and dozens of query images with known, metric camera poses. The dataset features changing conditions, stark viewpoint changes, high variability across places, and queries with low to no visual overlap with the reference image. We identify two viable families of existing methods to provide baseline results: relative pose regression, and feature matching combined with single-image depth prediction. While these methods show reasonable performance on some favorable scenes in our dataset, map-free relocalization proves to be a challenge that requires new, innovative solutions.
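
The second baseline family named in the abstract, feature matching combined with single-image depth prediction, depends on one step that may not be obvious: two-view geometry from matches gives the translation only up to scale, and predicted depth is what can supply the missing metric length. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the pose convention X_query = R · X_ref + s · t_dir, and the use of a median over per-match scale estimates are all assumptions made for illustration.

```python
import numpy as np


def backproject(K, kpts, depth):
    """Lift pixel keypoints (N, 2) with z-depths (N,) to 3D camera coordinates."""
    ones = np.ones((len(kpts), 1))
    rays = (np.linalg.inv(K) @ np.hstack([kpts, ones]).T).T  # rays with z = 1
    return rays * depth[:, None]


def recover_metric_scale(K, R, t_dir, kpts_ref, kpts_qry, depth_ref, depth_qry):
    """Estimate the metric length s of a scale-free translation direction.

    Assumed convention (hypothetical, for this sketch only):
        X_query = R @ X_ref + s * t_dir
    Each match yields one estimate of s; the median is used for robustness.
    """
    t_dir = t_dir / np.linalg.norm(t_dir)
    X_ref = backproject(K, kpts_ref, depth_ref)
    X_qry = backproject(K, kpts_qry, depth_qry)
    residual = X_qry - X_ref @ R.T  # ideally equal to s * t_dir for every match
    scales = residual @ t_dir       # projection onto the translation direction
    return float(np.median(scales))


def synthetic_example():
    """Build noise-free synthetic matches and return (estimated, true) scale."""
    rng = np.random.default_rng(0)
    K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
    a = 0.1  # rotation about the y axis, in radians
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    t = np.array([0.3, 0.0, 0.1])  # true metric translation

    X_ref = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 6.0], size=(50, 3))
    X_qry = X_ref @ R.T + t

    def project(X):
        uvw = (K @ X.T).T
        return uvw[:, :2] / uvw[:, 2:3]

    est = recover_metric_scale(
        K, R, t / np.linalg.norm(t),
        project(X_ref), project(X_qry),
        X_ref[:, 2], X_qry[:, 2],  # perfect depths stand in for predicted depth
    )
    return est, float(np.linalg.norm(t))
```

With real data, the depths would come from a single-image depth network and would be noisy, so the per-match scale estimates would spread out. A negative result would suggest the sign of the translation direction is flipped relative to the assumed convention.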

Eduardo Arnold, Jamie Wynn, Sara Vicente, Guillermo Garcia-Hernando, Áron Monszpart, Victor Adrian Prisacariu, Daniyar Turmukhambetov, Eric Brachmann • 2022

Related benchmarks

Task	Dataset	Result
Relative Pose Estimation	MegaDepth 1500	AUC @ 20°: 10	163
Relative Pose Estimation	ScanNet 1500 pairs (test)	AUC@5°: 1.84	56
Camera Localization	7 Scenes	Average Position Error (m): 0.13	46
Visual Localization	Cambridge Landmarks (test)	Avg Median Positional Error (m): 2.51	35
Visual Localization	Cambridge Landmarks College	Median Translation Error (m): 2.44	35
Visual Localization	Cambridge Landmarks Church	Median Translation Error (m): 2.91	35
Camera pose estimation	ACID	AUC @ 5°: 0.0257	30
Pose Estimation	7 Scenes	Average Median Translation Error (m): 0.12	29
Visual Localization	Cambridge Landmarks Hospital	Median Translation Error (m): 3.73	26
Relative Camera Pose Evaluation	ScanNet1500	AUC@5: 1.84	23

Showing 10 of 22 rows
