Scene-agnostic Pose Regression for Visual Localization

About

Absolute Pose Regression (APR) predicts 6D camera poses but cannot adapt to unknown environments without retraining, while Relative Pose Regression (RPR) generalizes better yet requires a large image retrieval database. Visual Odometry (VO) generalizes well to unseen environments but suffers from accumulated error in open trajectories. To address this dilemma, we introduce a new task, Scene-agnostic Pose Regression (SPR), which achieves accurate pose regression in a flexible way while eliminating the need for retraining or databases. To benchmark SPR, we created a large-scale dataset, 360SPR, with over 200K photorealistic panoramas, 3.6M pinhole images, and camera poses in 270 scenes at three different sensor heights. Furthermore, we propose SPR-Mamba as an initial model that addresses SPR in a dual-branch manner. Extensive experiments and studies demonstrate the effectiveness of our SPR paradigm, dataset, and model. In the unknown scenes of both the 360SPR and 360Loc datasets, our method consistently outperforms APR, RPR, and VO. The dataset and code are available at https://junweizheng93.github.io/publications/SPR/SPR.html.

Junwei Zheng, Ruiping Liu, Yufan Chen, Zhenfang Chen, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen • 2025

Related benchmarks

Task                   Dataset                                Metric                        Result  Rank
Visual Localization    360SPR Pinhole (unseen)                TE (m)                        3.78    14
Scene Pose Regression  360SPR 1.0 (unseen)                    Median Translation Error (m)  3.85    13
Visual Localization    360Loc cross-validation (unseen)       Median Translation Error (m)  1.94    13
Scene Pose Regression  360SPR 1.0 (seen)                      Median Translation Error (m)  3.32    13
Visual Localization    360Loc official (seen)                 Median Translation Error (m)  1.43    13
Visual Localization    7Scenes Pinhole (unseen environments)  Translation Error (m)         0.4     7
Visual Localization    360SPR (seen)                          Median Translation Error (m)  3.32    7