Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

About

Multi-view 3D reconstruction remains a core challenge in computer vision, particularly in applications requiring accurate and scalable representations across diverse perspectives. Current leading methods such as DUSt3R employ a fundamentally pairwise approach, processing images in pairs and necessitating costly global alignment procedures to reconstruct from multiple views. In this work, we propose Fast 3D Reconstruction (Fast3R), a novel multi-view generalization to DUSt3R that achieves efficient and scalable 3D reconstruction by processing many views in parallel. Fast3R's Transformer-based architecture forwards N images in a single forward pass, bypassing the need for iterative alignment. Through extensive experiments on camera pose estimation and 3D reconstruction, Fast3R demonstrates state-of-the-art performance, with significant improvements in inference speed and reduced error accumulation. These results establish Fast3R as a robust alternative for multi-view applications, offering enhanced scalability without compromising reconstruction accuracy.
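To make the scalability argument concrete, the sketch below counts network passes for a pairwise pipeline versus a single multi-view pass. It assumes the pairwise method runs once on every unordered image pair, which is an illustrative simplification: the abstract does not say how pairs are selected, and real pipelines may prune pairs or run both orderings. The function names are hypothetical and do not come from the Fast3R or DUSt3R code.

```python
def pairwise_pass_count(num_views: int) -> int:
    """Passes needed if every unordered image pair is processed once (assumption)."""
    if num_views < 2:
        return 0
    return num_views * (num_views - 1) // 2


def single_pass_count(num_views: int) -> int:
    """A multi-view model that takes all N images together needs one pass."""
    return 1 if num_views > 0 else 0


# Compare how the two counts grow with the number of input views.
for n in (2, 10, 100, 1000):
    print(f"views={n:5d}  pairwise={pairwise_pass_count(n):7d}  single={single_pass_count(n)}")
```

Under this assumption the pairwise count grows quadratically with the number of views, and the global alignment step mentioned in the abstract comes on top of it. The count says nothing about the cost of each pass, which also grows with the number of views in the single-pass case.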

Jianing Yang, Alexander Sax, Kevin J. Liang, Mikael Henaff, Hao Tang, Ang Cao, Joyce Chai, Franziska Meier, Matt Feiszli • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| Monocular Depth Estimation | KITTI | Abs Rel: 0.12 | 161 |
| Monocular Depth Estimation | NYU V2 | -- | 113 |
| Video Depth Estimation | Sintel | Relative Error (Rel): 0.518 | 109 |
| Video Depth Estimation | BONN | Relative Error (Rel): 0.193 | 103 |
| Camera pose estimation | Sintel | ATE: 0.371 | 92 |
| Camera pose estimation | ScanNet | ATE RMSE (Avg.): 0.155 | 61 |
| Camera pose estimation | TUM dynamics | RRE: 1.425 | 57 |
| 3D Reconstruction | DTU | Accuracy Median: 1.706 | 47 |
| Video Depth Estimation | KITTI | Abs Rel: 0.138 | 47 |
| 3D Reconstruction | Neural RGB-D (NRGBD) | Acc Mean: 0.135 | 38 |
Showing 10 of 34 rows
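
For readers unfamiliar with the depth metric in the table, here is a minimal sketch of absolute relative error, assuming "Abs Rel" has its usual meaning: the mean of |predicted - ground truth| / ground truth over pixels with valid ground truth. The page does not describe the evaluation protocol, so details such as scale or shift alignment before scoring are not modeled here. The function name and sample values are made up for illustration.

```python
def abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over entries with valid (positive) ground truth."""
    # Skip pixels without a valid ground-truth depth (encoded here as <= 0).
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


# Toy depths in arbitrary units; the last ground-truth value is invalid and ignored.
pred_depth = [1.1, 2.0, 3.3, 5.0]
gt_depth = [1.0, 2.0, 3.0, 0.0]
print(f"Abs Rel = {abs_rel(pred_depth, gt_depth):.4f}")
```

Lower is better for this metric. A value of 0.12 would mean predictions deviate from ground truth by 12% on average, under whatever alignment the benchmark applies.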
