
Learning to Localize in New Environments from Synthetic Training Data

About

Most existing approaches for visual localization either need a detailed 3D model of the environment or, in the case of learning-based methods, must be retrained for each new scene. This can either be very expensive or simply impossible for large, unknown environments, for example in search-and-rescue scenarios. Although there are learning-based approaches that operate scene-agnostically, these methods still generalize worse than classical approaches. In this paper, we present an approach that can generalize to new scenes by applying specific changes to the model architecture, including an extended regression part, the use of hierarchical correlation layers, and the exploitation of scale and uncertainty information. Our approach outperforms the 5-point algorithm using SIFT features on images of the same size and additionally surpasses all previous learning-based approaches that were trained on different data. It is also superior to most of the approaches that were specifically trained on the respective scenes. We also evaluate our approach in a scenario where only very few reference images are available, showing that under these more realistic conditions our learning-based approach considerably exceeds both existing learning-based and classical methods.

Dominik Winkelbauer, Maximilian Denninger, Rudolph Triebel • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Relative Pose Estimation | ScanNet 1500 pairs (test) | AUC@5° | 2.3 | 56 |
| Camera Localization | 7 Scenes | Average Position Error (m) | 0.08 | 46 |
| Visual Localization | Cambridge Landmarks (test) | Avg Median Positional Error (m) | 2.22 | 35 |
| Visual Localization | Cambridge Landmarks College | Median Translation Error (m) | 2.33 | 35 |
| Visual Localization | Cambridge Landmarks Church | Median Translation Error (m) | 2.3 | 35 |
| Camera pose estimation | ACID | AUC @ 5° | 0.0414 | 30 |
| Pose Estimation | 7 Scenes | Average Median Translation Error (m) | 0.07 | 29 |
| Visual Localization | Cambridge Landmarks Hospital | Median Translation Error (m) | 3.54 | 26 |
| Relative Camera Pose Evaluation | ScanNet1500 | AUC@5 | 2.3 | 23 |
| Visual Localization | Cambridge Landmarks ShopFacade | Median Translation Error | 0.72 | 21 |
Showing 10 of 16 rows
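
Several rows above report a median translation error. As a rough illustration of what such a number usually means, the sketch below computes the Euclidean distance between estimated and ground-truth camera positions for each query image and takes the median. It assumes positions are given as 3D coordinates in meters; the function names and the sample positions are hypothetical, and the exact evaluation protocol of each benchmark may differ.

```python
import math
from statistics import median


def translation_error(estimated, ground_truth):
    """Euclidean distance between two 3D camera positions (same units as the input)."""
    return math.dist(estimated, ground_truth)


def median_translation_error(estimated_positions, ground_truth_positions):
    """Median of the per-image translation errors over a set of query images."""
    if len(estimated_positions) != len(ground_truth_positions):
        raise ValueError("need one ground-truth position per estimate")
    if not estimated_positions:
        raise ValueError("need at least one position pair")
    errors = [
        translation_error(est, gt)
        for est, gt in zip(estimated_positions, ground_truth_positions)
    ]
    return median(errors)


# Hypothetical positions in meters, chosen only to make the arithmetic easy to follow.
estimated = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0), (3.0, 0.0, 4.0)]
ground_truth = [(0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

print(median_translation_error(estimated, ground_truth))
```

The median is less sensitive than the mean to a few badly localized images, which is one common reason localization benchmarks report it.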
