Learning to Localize in New Environments from Synthetic Training Data

About

Most existing approaches to visual localization either need a detailed 3D model of the environment or, in the case of learning-based methods, must be retrained for each new scene. This can be very expensive or simply impossible for large, unknown environments, for example in search-and-rescue scenarios. Although some learning-based approaches operate scene-agnostically, they still generalize worse than classical approaches. In this paper, we present an approach that generalizes to new scenes through specific changes to the model architecture, including an extended regression part, hierarchical correlation layers, and the exploitation of scale and uncertainty information. Our approach outperforms the 5-point algorithm using SIFT features on images of equal size and additionally surpasses all previous learning-based approaches that were trained on different data. It is also superior to most of the approaches that were specifically trained on the respective scenes. We also evaluate our approach in a scenario where only very few reference images are available, showing that under these more realistic conditions our learning-based approach considerably exceeds both existing learning-based and classical methods.

Dominik Winkelbauer, Maximilian Denninger, Rudolph Triebel • 2020

Related benchmarks

Task	Dataset	Metric	Result
Relative Pose Estimation	ScanNet 1500 pairs (test)	AUC@5°	2.3	56
Camera Localization	7 Scenes	Average Position Error (m)	0.08	46
Visual Localization	Cambridge Landmarks (test)	Avg Median Positional Error (m)	2.22	35
Visual Localization	Cambridge Landmarks College	Median Translation Error (m)	2.33	35
Visual Localization	Cambridge Landmarks Church	Median Translation Error (m)	2.3	35
Visual Localization	7 Scenes	Chess Median Translation Error (cm)	5	33
Camera pose estimation	ACID	AUC @ 5°	0.0414	30
Pose Estimation	7 Scenes	Average Median Translation Error (m)	0.07	29
Visual Localization	Cambridge Landmarks Hospital	Median Translation Error (m)	3.54	26
Visual Localization	Cambridge Landmarks	College: Median Translation Error (cm)	233	25

Showing 10 of 20 rows
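
Several rows above report a median translation error, in meters or in centimeters. As a rough illustration of what such a number means, the sketch below computes the median Euclidean distance between estimated and ground-truth camera positions and converts it between units. This page does not state how the benchmarks are actually evaluated, so the function names (`median_translation_error`, `meters_to_cm`) and the sample positions are hypothetical and only show the usual definition of the metric.

```python
import math
from statistics import median


def median_translation_error(estimated, ground_truth):
    """Median Euclidean distance between paired camera positions.

    Both arguments are sequences of (x, y, z) tuples in the same unit.
    This is the common definition of the metric, assumed here; the
    benchmarks listed on this page may differ in detail.
    """
    if len(estimated) != len(ground_truth):
        raise ValueError("need one ground-truth position per estimate")
    errors = [math.dist(e, g) for e, g in zip(estimated, ground_truth)]
    return median(errors)


def meters_to_cm(value_m):
    """Convert a distance from meters to centimeters."""
    return value_m * 100.0


# Hypothetical positions in meters, made up for illustration only.
estimated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.3), (0.0, 2.0, 0.2)]

error_m = median_translation_error(estimated, ground_truth)
print(f"median translation error: {error_m:.2f} m = {meters_to_cm(error_m):.0f} cm")
```

The median is used rather than the mean because a few badly localized query images would otherwise dominate the score.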
