G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration

About

We introduce G-CUT3R, a novel feed-forward approach for guided 3D scene reconstruction that enhances the CUT3R model by integrating prior information. Unlike existing feed-forward methods that rely solely on input images, our method leverages auxiliary data, such as depth, camera calibrations, or camera positions, commonly available in real-world scenarios. We propose a lightweight modification to CUT3R, incorporating a dedicated encoder for each modality to extract features, which are fused with RGB image tokens via zero convolution. This flexible design enables seamless integration of any combination of prior information during inference. Evaluated across multiple benchmarks, including 3D reconstruction and other multi-view tasks, our approach demonstrates significant performance improvements, showing its ability to effectively utilize available priors while maintaining compatibility with varying input modalities.
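The fusion step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes that "zero convolution" means a projection whose weights and bias are initialized to zero, and that projected prior features are added to the RGB tokens. The names ZeroLinear and fuse, the token shapes, and the modality keys "depth" and "pose" are hypothetical.

```python
from typing import Dict, List, Optional

Tokens = List[List[float]]  # one feature vector per image token


class ZeroLinear:
    """Per-token linear projection with zero-initialized weights and bias.

    Applied token by token, it behaves like a 1x1 convolution whose
    parameters start at zero (hypothetical stand-in for a zero convolution).
    """

    def __init__(self, dim_in: int, dim_out: int) -> None:
        self.weight = [[0.0] * dim_in for _ in range(dim_out)]
        self.bias = [0.0] * dim_out

    def __call__(self, tokens: Tokens) -> Tokens:
        return [
            [sum(w * x for w, x in zip(row, tok)) + b
             for row, b in zip(self.weight, self.bias)]
            for tok in tokens
        ]


def fuse(rgb_tokens: Tokens,
         prior_features: Dict[str, Optional[Tokens]],
         zero_layers: Dict[str, ZeroLinear]) -> Tokens:
    """Add each available prior's projected features to the RGB tokens."""
    fused = [list(tok) for tok in rgb_tokens]
    for name, feats in prior_features.items():
        if feats is None:  # modality not provided at inference: skip it
            continue
        projected = zero_layers[name](feats)
        fused = [[f + p for f, p in zip(ft, pt)]
                 for ft, pt in zip(fused, projected)]
    return fused


# Two RGB tokens of width 2; depth-encoder features of width 3 (toy values).
rgb = [[1.0, 2.0], [3.0, 4.0]]
depth_feats = [[0.5, -0.5, 1.0], [2.0, 0.0, 1.0]]
layers = {"depth": ZeroLinear(3, 2), "pose": ZeroLinear(3, 2)}

# At initialization the guidance branch contributes nothing,
# so the fused tokens equal the RGB tokens.
out_init = fuse(rgb, {"depth": depth_feats, "pose": None}, layers)

# Simulate one learned weight: the depth prior now shifts the tokens.
layers["depth"].weight[0][0] = 1.0
out_trained = fuse(rgb, {"depth": depth_feats, "pose": None}, layers)
```

Under these assumptions, the zero initialization means the modified model starts out reproducing the base model's token stream, and skipping absent modalities is what lets any combination of priors be supplied at inference.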

Ramil Khafizov, Artem Komarichev, Ruslan Rakhimov, Peter Wonka, Evgeny Burnaev • 2025

Related benchmarks

Task         Dataset                     Metric                 Result   Rank
Visual SLAM  Virtual KITTI Sequence 01   ATE RMSE, Clone (m)    43.3     5
Visual SLAM  Virtual KITTI Sequence 02   ATE RMSE, Clone (m)    23.77    5
Visual SLAM  Virtual KITTI Sequence 06   ATE RMSE, Clone (m)    0.84     5
Visual SLAM  Virtual KITTI Sequence 18   ATE RMSE, Clone (m)    19.44    5
Visual SLAM  Virtual KITTI Sequence 20   ATE RMSE, Clone (m)    129.5    5
