NRGS-SLAM: Monocular Non-Rigid SLAM for Endoscopy via Deformation-Aware 3D Gaussian Splatting

About

Visual simultaneous localization and mapping (V-SLAM) is a fundamental capability for autonomous perception and navigation. However, endoscopic scenes violate the rigidity assumption due to persistent soft-tissue deformations, creating a strong coupling ambiguity between camera ego-motion and intrinsic deformation. Although recent monocular non-rigid SLAM methods have made notable progress, they often lack effective decoupling mechanisms and rely on sparse or low-fidelity scene representations, which leads to tracking drift and limited reconstruction quality. To address these limitations, we propose NRGS-SLAM, a monocular non-rigid SLAM system for endoscopy based on 3D Gaussian Splatting. To resolve the coupling ambiguity, we introduce a deformation-aware 3D Gaussian map that augments each Gaussian primitive with a learnable deformation probability, optimized via a Bayesian self-supervision strategy without requiring external non-rigidity labels. Building on this representation, we design a deformable tracking module that performs robust coarse-to-fine pose estimation by prioritizing low-deformation regions, followed by efficient per-frame deformation updates. A carefully designed deformable mapping module progressively expands and refines the map, balancing representational capacity and computational efficiency. In addition, a unified robust geometric loss incorporates external geometric priors to mitigate the inherent ill-posedness of monocular non-rigid SLAM. Extensive experiments on multiple public endoscopic datasets demonstrate that NRGS-SLAM achieves more accurate camera pose estimation (up to 50% reduction in RMSE) and higher-quality photo-realistic reconstructions than state-of-the-art methods. Comprehensive ablation studies further validate the effectiveness of our key design choices. Source code will be publicly available upon paper acceptance.
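The idea of prioritizing low-deformation regions during pose estimation can be illustrated with a small sketch. This is not the authors' implementation, which has not been released. It assumes, purely for illustration, that each Gaussian contributes a scalar residual to the tracking loss and that its weight is one minus its deformation probability. The function names `rigidity_weights` and `weighted_tracking_loss` are hypothetical.

```python
from typing import List, Sequence


def rigidity_weights(deform_prob: Sequence[float]) -> List[float]:
    """Map per-Gaussian deformation probabilities in [0, 1] to weights 1 - p.

    Assumption for illustration only: the abstract does not specify the
    weighting function used by NRGS-SLAM.
    """
    for p in deform_prob:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"deformation probability out of range: {p}")
    return [1.0 - p for p in deform_prob]


def weighted_tracking_loss(
    residuals: Sequence[float],
    deform_prob: Sequence[float],
    eps: float = 1e-8,
) -> float:
    """Weighted mean of squared residuals, down-weighting deforming regions."""
    if len(residuals) != len(deform_prob):
        raise ValueError("residuals and deform_prob must have equal length")
    weights = rigidity_weights(deform_prob)
    numerator = sum(w * r * r for w, r in zip(weights, residuals))
    denominator = sum(weights)
    # eps guards against division by zero when every region is deforming.
    return numerator / (denominator + eps)


if __name__ == "__main__":
    # The large residual on the likely-deforming point barely affects the loss.
    print(weighted_tracking_loss([1.0, 10.0], [0.05, 0.95]))
```

Under this assumed weighting, a point with deformation probability near 1 contributes almost nothing to the pose objective, so the camera pose is driven mainly by regions believed to be rigid.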

Jiwei Shan, Zeyu Cai, Yirui Li, Yongbo Chen, Lijun Han, Yun-hui Liu, Hesheng Wang, Shing Shin Cheng • 2026

Related benchmarks

Task	Dataset	Result
Camera Localization	StereoMIS (P2-2)	RMSE 10.24	16
Camera Localization	StereoMIS (P2-4)	RMSE 9.45	16
Camera Localization	StereoMIS Average	RMSE 6.78	16
Camera Localization	StereoMIS (P2-3)	RMSE 0.003	16
Camera Localization	StereoMIS (P2-5)	RMSE 7.41	14
Camera Localization	C3VD c1_descending_t4_v4 v2	RMSE 6.81	9
Camera Localization	C3VD v2 (c2_transverse1_t1_v4)	RMSE 10.47	9
Camera Localization	C3VD Average v2	RMSE 8.13	9
Camera Localization	C3VD c1_sigmoid2_t4_v4 v2	RMSE 7.26	9
Camera Localization	C3VD c1_sigmoid1_t4_v4 v2	RMSE 7.96	8
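
For readers unfamiliar with the metric, the following minimal sketch shows how an RMSE over camera positions is commonly computed. It assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same frame and scale. The page does not state the exact evaluation protocol or units behind the values in the table, so the function name `translation_rmse` and the input layout are illustrative assumptions.

```python
import math
from typing import Sequence, Tuple

Point3 = Tuple[float, float, float]


def translation_rmse(
    estimated: Sequence[Point3],
    ground_truth: Sequence[Point3],
) -> float:
    """Root-mean-square Euclidean error between paired camera positions.

    Assumes both trajectories are already aligned and associated frame by
    frame; no alignment step is performed here.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    est = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(translation_rmse(est, gt))
```

Because the errors are squared before averaging, a few frames with large drift raise the RMSE more than many frames with small errors.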

Showing 10 of 21 rows
