Image-based localization using LSTMs for structured feature correlation

About

In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes. CNNs allow us to learn suitable feature representations for localization that are robust against motion blur and illumination changes. We make use of LSTM units on the CNN output, which play the role of a structured dimensionality reduction on the feature vector, leading to drastic improvements in localization performance. We provide an extensive quantitative comparison of CNN-based and SIFT-based localization methods, showing the weaknesses and strengths of each. Furthermore, we present a new large-scale indoor dataset with accurate ground truth from a laser scanner. Experimental results on both indoor and outdoor public datasets show that our method outperforms existing deep architectures and can localize images in hard conditions, e.g., in the presence of mostly textureless surfaces, where classic SIFT-based methods fail.

Florian Walch, Caner Hazirbas, Laura Leal-Taixé, Torsten Sattler, Sebastian Hilsenbeck, Daniel Cremers • 2016

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Camera Localization | 7 Scenes | Average Position Error (m) | 0.31 | 46 |
| Visual Localization | 7Scenes (test) | Chess Median Angular Error (°) | 5.77 | 41 |
| Camera Localization | 7-Scenes Chess | Translation Error (m) | 0.24 | 40 |
| Visual Localization | Cambridge Landmarks (test) | Avg Median Positional Error (m) | 1.3 | 35 |
| Camera Relocalization | 7-Scenes (test) | Median Translation Error (cm) | 31 | 30 |
| Camera Pose Regression | 7Scenes Fire | Median Position Error (m) | 0.34 | 26 |
| Camera Pose Regression | 7Scenes Heads | Median Position Error (m) | 0.21 | 26 |
| Camera Pose Regression | 7Scenes Pumpkin | Median Position Error (m) | 0.33 | 26 |
| Camera Pose Regression | 7Scenes | Median Position Error (m) | 0.31 | 26 |
| Camera Pose Regression | 7Scenes (Office) | Median Position Error (m) | 0.3 | 26 |
Showing 10 of 33 rows
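
The benchmark rows above report median position errors (in metres) and median angular errors (in degrees). The listing does not say how these were computed, so the following is only a minimal sketch of one common way to evaluate camera pose regression: Euclidean distance for position and the smallest rotation angle between two orientation quaternions. The function names, the (w, x, y, z) quaternion order, and the sample poses are assumptions made for illustration, not taken from the paper or the benchmark.

```python
import math
from statistics import median


def position_error(t_pred, t_true):
    """Euclidean distance between predicted and ground-truth camera positions."""
    return math.dist(t_pred, t_true)


def angular_error_deg(q_pred, q_true):
    """Smallest rotation angle, in degrees, between two quaternions (w, x, y, z)."""
    # Normalise both quaternions, since a regressed quaternion is rarely unit length.
    n_pred = math.sqrt(sum(c * c for c in q_pred))
    n_true = math.sqrt(sum(c * c for c in q_true))
    dot = sum(a * b for a, b in zip(q_pred, q_true)) / (n_pred * n_true)
    # q and -q encode the same rotation, so take the absolute value;
    # clamp to 1.0 to guard against floating-point overshoot before acos.
    dot = min(1.0, abs(dot))
    return math.degrees(2.0 * math.acos(dot))


def median_errors(predictions, ground_truth):
    """Return (median position error, median angular error) over paired poses.

    Each pose is a (position, quaternion) tuple.
    """
    pos = [position_error(p[0], g[0]) for p, g in zip(predictions, ground_truth)]
    ang = [angular_error_deg(p[1], g[1]) for p, g in zip(predictions, ground_truth)]
    return median(pos), median(ang)


if __name__ == "__main__":
    # Hypothetical poses, made up purely to exercise the functions.
    truth = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))] * 3
    preds = [
        ((0.1, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0)),
        ((0.0, 0.3, 0.0), (0.99, 0.0, 0.1, 0.0)),
        ((0.0, 0.0, 0.5), (0.95, 0.2, 0.0, 0.0)),
    ]
    print(median_errors(preds, truth))
```

Using the median rather than the mean keeps a few badly localized frames from dominating the reported number, which may be why the rows above mostly quote medians.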
