SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments

About

In this paper, we present SPVLoc, a global indoor localization method that accurately determines the six-dimensional (6D) camera pose of a query image and requires minimal scene-specific prior knowledge and no scene-specific training. Our approach employs a novel matching procedure to localize the perspective camera's viewport, given as an RGB image, within a set of panoramic semantic layout representations of the indoor environment. The panoramas are rendered from an untextured 3D reference model, which only comprises approximate structural information about room shapes, along with door and window annotations. We demonstrate that a straightforward convolutional network structure can successfully achieve image-to-panorama and ultimately image-to-model matching. Through a viewport classification score, we rank reference panoramas and select the best match for the query image. Then, a 6D relative pose is estimated between the chosen panorama and the query image. Our experiments demonstrate that this approach not only efficiently bridges the domain gap but also generalizes well to previously unseen scenes that are not part of the training data. Moreover, it achieves superior localization accuracy compared to state-of-the-art methods and also estimates more degrees of freedom of the camera pose. Our source code is publicly available at https://fraunhoferhhi.github.io/spvloc.
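The two-stage procedure described in the abstract (rank reference panoramas by a score, then combine the best panorama's known pose with an estimated relative pose) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `PanoramaCandidate`, `localize`, `translation`, and `compose` are hypothetical, the scores and relative pose are supplied by hand instead of being predicted by a network, and the example poses are translation-only for readability.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]  # 4x4 homogeneous transform, row-major


@dataclass
class PanoramaCandidate:
    name: str
    score: float             # stand-in for a viewport classification score
    world_from_pano: Matrix  # known pose of the rendered panorama in the model


def translation(x: float, y: float, z: float) -> Matrix:
    """Build a pure-translation transform (identity rotation)."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Return the 4x4 matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def localize(candidates: Sequence[PanoramaCandidate],
             pano_from_camera: Dict[str, Matrix]) -> Tuple[PanoramaCandidate, Matrix]:
    """Pick the highest-scoring panorama and chain its pose with the relative pose.

    `pano_from_camera` maps a panorama name to the (assumed, precomputed)
    relative pose of the query camera with respect to that panorama.
    """
    if not candidates:
        raise ValueError("at least one reference panorama is required")
    best = max(candidates, key=lambda c: c.score)
    world_from_camera = compose(best.world_from_pano, pano_from_camera[best.name])
    return best, world_from_camera


# Illustrative values only; they are not taken from the paper.
candidates = [
    PanoramaCandidate("pano_a", 0.2, translation(0.0, 0.0, 1.5)),
    PanoramaCandidate("pano_b", 0.9, translation(4.0, 0.0, 1.5)),
]
relative = {"pano_b": translation(0.5, -0.25, 0.0)}

best, world_from_camera = localize(candidates, relative)
camera_position = [row[3] for row in world_from_camera[:3]]
print(best.name, camera_position)
```

In this sketch the absolute camera pose is simply the product of the selected panorama's pose and the relative pose; a full 6D pose would also carry a rotation in the upper-left 3x3 block, which the same `compose` call handles unchanged.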

Niklas Gard, Anna Hilsmann, Peter Eisert • 2024

Related benchmarks

Task	Dataset	Result
6D camera localization	ZInD	Median Translation Error (<1m) (cm): 13.63	9
6D camera localization	Structured3D Furnishing-Level: Full	Median Translation Error (<1m): 9.17	9
Perspective 120° FoV image-to-map localization	Structured3D Furnishing-Level: Full	Recall @ 10cm: 30.55	6
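For readers unfamiliar with the metric names in the table, the sketch below shows one plausible way to compute them from per-query translation errors. The definitions are assumptions, not taken from the benchmark page: "Median Translation Error (<1m)" is read as the median over queries whose error is below 1 m, and "Recall @ 10cm" as the percentage of queries with an error of at most 10 cm. The function names and the sample errors are made up for illustration.

```python
from statistics import median
from typing import Sequence


def median_error_below(errors_cm: Sequence[float], threshold_cm: float = 100.0) -> float:
    """Median of the errors that fall strictly below the threshold (assumed definition)."""
    kept = [e for e in errors_cm if e < threshold_cm]
    if not kept:
        raise ValueError("no query was localized below the threshold")
    return median(kept)


def recall_at(errors_cm: Sequence[float], threshold_cm: float) -> float:
    """Percentage of queries with an error at or below the threshold (assumed definition)."""
    if not errors_cm:
        raise ValueError("at least one query is required")
    hits = sum(1 for e in errors_cm if e <= threshold_cm)
    return 100.0 * hits / len(errors_cm)


# Synthetic per-query translation errors in centimetres, for illustration only.
errors_cm = [5.0, 8.0, 12.0, 40.0, 250.0]

median_cm = median_error_below(errors_cm)       # median of [5, 8, 12, 40]
recall_10cm = recall_at(errors_cm, 10.0)        # 2 of 5 queries within 10 cm
print(median_cm, recall_10cm)
```

Under these assumed definitions, the one outlier at 250 cm lowers the recall but does not affect the thresholded median, which is why the two numbers are usually reported together.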

