PLGSLAM: Progressive Neural Scene Representation with Local to Global Bundle Adjustment

About

Neural implicit scene representations have recently shown encouraging results in dense visual SLAM. However, existing methods produce low-quality scene reconstructions and inaccurate localization when scaling up to large indoor scenes and long sequences. These limitations are mainly due to their single, global radiance field with finite capacity, which does not adapt to large scenarios. Their end-to-end pose networks are also not robust enough as cumulative errors grow in large scenes. To this end, we introduce PLGSLAM, a neural visual SLAM system capable of high-fidelity surface reconstruction and robust camera tracking in real time. To handle large-scale indoor scenes, PLGSLAM uses a progressive scene representation method that dynamically allocates new local scene representations, each trained with frames within a local sliding window. This allows us to scale up to larger indoor scenes and improves robustness, even under pose drift. Within each local scene representation, PLGSLAM combines tri-planes for local high-frequency features with multi-layer perceptron (MLP) networks for low-frequency features, achieving smoothness and scene completion in unobserved areas. Moreover, we propose a local-to-global bundle adjustment method with a global keyframe database to address the increased pose drift on long sequences. Experimental results demonstrate that PLGSLAM achieves state-of-the-art scene reconstruction and tracking performance across various datasets and scenarios, in both small and large-scale indoor environments. The code is open-sourced at https://github.com/dtc111111/plgslam.

Tianchen Deng, Guole Shen, Tong Qin, Jianyu Wang, Wentao Zhao, Jingchuan Wang, Danwei Wang, Weidong Chen • 2023
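
The progressive allocation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `LocalMap` and `ProgressiveMapper` names, the distance-based allocation rule, and the `radius` and `window` parameters are all assumptions chosen for illustration. The abstract only states that new local representations are allocated dynamically and trained on frames from a local sliding window.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import dist


@dataclass
class LocalMap:
    """Hypothetical stand-in for one local scene representation."""
    center: tuple[float, float, float]
    frame_ids: list[int] = field(default_factory=list)


class ProgressiveMapper:
    """Allocates a new local map when the camera leaves the current one.

    Assumption: allocation is triggered by a distance threshold (`radius`),
    and each local map keeps only its `window` most recent frames.
    """

    def __init__(self, radius: float, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.radius = radius
        self.window = window
        self.maps: list[LocalMap] = []

    def add_frame(self, frame_id: int, position: tuple[float, float, float]) -> LocalMap:
        # Start a new local map if none exists yet or the camera has moved
        # farther than `radius` from the centre of the current one.
        if not self.maps or dist(position, self.maps[-1].center) > self.radius:
            self.maps.append(LocalMap(center=tuple(position)))
        current = self.maps[-1]
        current.frame_ids.append(frame_id)
        # Sliding window: drop everything except the most recent frames.
        del current.frame_ids[:-self.window]
        return current


mapper = ProgressiveMapper(radius=2.0, window=3)
for i, x in enumerate([0.0, 0.5, 1.0, 1.5, 2.5, 3.0]):
    mapper.add_frame(i, (x, 0.0, 0.0))
```

In this toy run the camera moves along the x axis; the fifth frame falls outside the assumed 2.0 m radius of the first local map, so a second one is allocated, while the first retains only its last three frames.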

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Localization | Replica (8 scenes average) | ATE Mean (cm) | 0.525 | 12 |
| Surface Reconstruction | Replica small room scenes average | Depth L1 (cm) | 0.771 | 6 |
| Surface Reconstruction | Apartment dataset | Depth L1 (cm) | 6.033 | 6 |
| Pose Estimation | Apartment dataset | ATE Mean (cm) | 5.574 | 6 |
| Localization | ScanNet | ATE Mean (cm) | 6.12 | 4 |
| Reconstruction | ScanNet | Accuracy (cm) | 19.42 | 4 |
| SLAM | Replica | Frame Processing Time (s) | 0.14 | 4 |
| SLAM | ScanNet | Speed FPT (s) | 0.37 | 4 |
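
For readers unfamiliar with the ATE Mean metric reported above, the following is a minimal sketch of how a mean absolute trajectory error in centimetres can be computed. It assumes camera positions are given in metres and that the estimated trajectory has already been aligned to the ground truth; the alignment step used by standard evaluation tools is omitted here. The function name `ate_mean_cm` is hypothetical, and the listing does not state which evaluation protocol produced its numbers.

```python
from math import dist


def ate_mean_cm(estimated, ground_truth):
    """Mean translational error between two aligned trajectories, in cm.

    Both arguments are equal-length sequences of (x, y, z) positions in
    metres (an assumption of this sketch).
    """
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Per-frame Euclidean distance between estimated and true positions.
    errors = [dist(e, g) for e, g in zip(estimated, ground_truth)]
    # Average over frames and convert metres to centimetres.
    return 100.0 * sum(errors) / len(errors)


est = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.01), (1.0, 0.03, 0.0)]
error_cm = ate_mean_cm(est, gt)
```

Here the per-frame errors are 1 cm and 3 cm, so `error_cm` comes out at about 2 cm.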
