Image-based Localization using Hourglass Networks

About

In this paper, we propose an encoder-decoder convolutional neural network (CNN) architecture for estimating camera pose (orientation and location) from a single RGB image. The architecture has an hourglass shape consisting of a chain of convolution and up-convolution layers followed by a regression part. The up-convolution layers are introduced to preserve the fine-grained information of the input image. Following common practice, we train our model in an end-to-end manner, utilizing transfer learning from large-scale classification data. The experiments demonstrate the performance of the approach on data exhibiting different lighting conditions, reflections, and motion blur. The results indicate a clear improvement over the previous state of the art, even when compared to methods that utilize a sequence of test frames instead of a single frame.
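To make the hourglass shape concrete, the sketch below traces how the spatial size of a feature map shrinks through strided convolutions and grows back through up-convolutions. It is only an illustration: the abstract does not give layer counts, kernel sizes, strides, or input resolution, so the 224-pixel input, the depth of 3, and the kernel/stride/padding values are all assumptions, and the function names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a standard convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def upconv_out(size, kernel, stride, padding):
    """Output size of an up-convolution (transposed convolution),
    assuming no output padding and no dilation."""
    return (size - 1) * stride - 2 * padding + kernel


def trace_hourglass(size, depth):
    """Return the spatial size after each layer of a toy hourglass.

    Assumed (not from the paper): every encoder layer is a 3x3 convolution
    with stride 2 and padding 1, and every decoder layer is a 4x4
    up-convolution with stride 2 and padding 1.
    """
    sizes = [size]
    for _ in range(depth):  # encoder: each layer halves the resolution
        size = conv_out(size, kernel=3, stride=2, padding=1)
        sizes.append(size)
    for _ in range(depth):  # decoder: each layer doubles the resolution
        size = upconv_out(size, kernel=4, stride=2, padding=1)
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    print(trace_hourglass(224, depth=3))
```

Under these assumed settings the sizes narrow to a bottleneck and then widen again, which is the shape the abstract describes; the regression part that predicts the pose would sit after the last up-convolution.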

Iaroslav Melekhov, Juha Ylioinas, Juho Kannala, Esa Rahtu • 2017

Related benchmarks

Task                   Dataset                    Metric                          Result  Rank
Camera Localization    7 Scenes                   Average Position Error (m)      0.23    46
Visual Localization    7Scenes (test)             Chess Median Angular Error (°)  6.17    41
Camera Localization    7-Scenes Chess             Translation Error (m)           0.15    40
Camera Relocalization  7-Scenes (test)            Median Translation Error (cm)   23      30
Camera Relocalization  7-Scenes dSLAM GT (test)   Median Translation Error (cm)   23      16
Camera Localization    7-Scenes Pumpkin           Translation Error (m)           0.25    14
Camera Localization    7-Scenes Stairs            Translation Error (m)           0.29    14
Camera Localization    7-Scenes Average           Translation Error (m)           0.23    14
Camera Localization    7-Scenes Fire              Translation Error (m)           0.27    14
Camera Localization    7-Scenes Office            Translation Error (m)           0.21    14
Showing 10 of 13 rows
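
The benchmark entries above report translation errors (in m or cm) and angular errors (in degrees). As a rough sketch of how such numbers are typically computed, the code below assumes each pose is a 3D position plus an orientation quaternion, and that the median is taken over the test frames of a scene. The function names are hypothetical, and the exact evaluation protocol of each benchmark may differ.

```python
import math
from statistics import median


def translation_error(t_pred, t_true):
    """Euclidean distance between predicted and ground-truth positions,
    in the same unit as the inputs (e.g. metres)."""
    return math.dist(t_pred, t_true)


def angular_error_deg(q_pred, q_true):
    """Angle in degrees between two orientations given as quaternions."""
    # Normalise both quaternions so non-unit network outputs are handled.
    n_pred = math.sqrt(sum(c * c for c in q_pred))
    n_true = math.sqrt(sum(c * c for c in q_true))
    dot = sum(a * b for a, b in zip(q_pred, q_true)) / (n_pred * n_true)
    # q and -q encode the same rotation, so take the absolute value;
    # clamp to 1.0 to guard against rounding slightly above it.
    dot = min(1.0, abs(dot))
    return math.degrees(2.0 * math.acos(dot))


def median_errors(preds, truths):
    """Median translation and angular error over paired (position, quaternion) poses."""
    t_errs = [translation_error(p[0], g[0]) for p, g in zip(preds, truths)]
    r_errs = [angular_error_deg(p[1], g[1]) for p, g in zip(preds, truths)]
    return median(t_errs), median(r_errs)
```

A median is less sensitive than a mean to a few badly mislocalized frames, which is one reason it is a common summary statistic for per-scene localization error.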
