
MUSt3R: Multi-view Network for Stereo 3D Reconstruction

About

DUSt3R introduced a novel paradigm in geometric computer vision by proposing a model that can provide dense and unconstrained stereo 3D reconstruction of arbitrary image collections with no prior information about camera calibration or viewpoint poses. Under the hood, however, DUSt3R processes image pairs, regressing local 3D reconstructions that need to be aligned in a global coordinate system. The number of pairs, which grows quadratically with the number of images, is an inherent limitation that becomes especially concerning for robust and fast optimization in the case of large image collections. In this paper, we propose an extension of DUSt3R from pairs to multiple views that addresses all of the aforementioned concerns. First, we propose a Multi-view Network for Stereo 3D Reconstruction, or MUSt3R, that modifies the DUSt3R architecture by making it symmetric and extending it to directly predict 3D structure for all views in a common coordinate frame. Second, we endow the model with a multi-layer memory mechanism that reduces the computational complexity and scales the reconstruction to large collections, inferring thousands of 3D pointmaps at high frame rates with limited added complexity. The framework is designed to perform 3D reconstruction both offline and online, and hence can be seamlessly applied to SfM and visual SLAM scenarios, showing state-of-the-art performance on various 3D downstream tasks, including uncalibrated visual odometry, relative camera pose, scale and focal estimation, 3D reconstruction, and multi-view depth estimation.
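
To make the quadratic growth of image pairs mentioned in the abstract concrete, here is a minimal sketch that counts pairs for a few collection sizes. It assumes every unordered pair of images is considered; the abstract does not say whether pairs are ordered or whether a sparser pairing graph is used in practice, and the function name `num_unordered_pairs` is purely illustrative.

```python
from itertools import combinations


def num_unordered_pairs(n_images: int) -> int:
    """Count distinct unordered image pairs: n * (n - 1) / 2.

    Assumption: every pair of images is processed (hypothetical worst case).
    """
    return n_images * (n_images - 1) // 2


for n in (10, 100, 1000):
    # Cross-check the closed form against explicit enumeration.
    assert num_unordered_pairs(n) == sum(1 for _ in combinations(range(n), 2))
    print(f"{n} images -> {num_unordered_pairs(n)} pairs")
# prints:
# 10 images -> 45 pairs
# 100 images -> 4950 pairs
# 1000 images -> 499500 pairs
```

A 100-fold increase in images (10 to 1000) yields roughly a 10,000-fold increase in pairs, which is the scaling concern that motivates predicting all views in a common coordinate frame.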

Yohann Cabon, Lucas Stoffl, Leonid Antsfeld, Gabriela Csurka, Boris Chidlovskii, Jerome Revaud, Vincent Leroy • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
3D Reconstruction | DTU | Accuracy Median | 1.863 | 47
3D Reconstruction | Neural RGB-D (NRGBD) | Acc Mean | 0.062 | 38
Visual Odometry | TUM-RGBD | freiburg1/xyz Error | 0.2 | 34
3D Reconstruction | 7 Scenes | Accuracy Mean | 2.8 | 32
Multi-view pose regression | CO3D v2 | RRA@15 | 97 | 31
Multi-view Depth Estimation | ScanNet (test) | Abs Rel | 3.3 | 23
Multi-view pose regression | RealEstate10K | mAA(30) | 75.1 | 15
Multi-view Depth Estimation | ETH3D (test) | Relative Error (rel) | 2.5 | 9
Multi-view Depth Estimation | KITTI 2015 (test) | Rel Error | 6.1 | 9
Multi-view Depth Estimation | Average (KITTI, ScanNet, ETH3D, DTU, T&T) (test) | Relative Error | 4.7 | 9
Showing 10 of 20 rows
