$R^3$: 3D Reconstruction via Relative Regression

About

Recent feed-forward geometry foundation models have demonstrated impressive generalization by recovering depth and poses in a single forward pass. However, these models typically assume a global coordinate frame. This assumption becomes a significant bottleneck for long-context and streaming reconstruction, because it forces the network to maintain an arbitrary temporal origin and to handle translation magnitudes that grow without bound over time. Our solution, $R^3$, uses relative regression instead: a lightweight MLP predicts confidence-weighted relative constraints. These confidences serve as a unified anchor, weighting the losses during training and guiding pose aggregation during inference. $R^3$ supports both full-context offline reconstruction and causal, bounded-memory streaming. Evaluation in both offline and streaming settings validates the effectiveness of the relative mechanism. Project page: https://kevinxu02.github.io/r3-site
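
To make the idea of confidence-guided pose aggregation concrete, here is a minimal sketch in plain Python. It is not the aggregation used in $R^3$, which the abstract does not specify. It assumes that each already-placed anchor frame supplies a predicted relative pose to the target frame plus a scalar confidence. The names `translation`, `matmul4`, and `aggregate_pose` are hypothetical, and taking the rotation from the single most confident candidate is a simplification chosen for brevity.

```python
from typing import List, Sequence, Tuple

Mat4 = List[List[float]]  # 4x4 homogeneous rigid transform, row-major


def translation(x: float, y: float, z: float) -> Mat4:
    """Build a pure-translation transform (identity rotation)."""
    return [
        [1.0, 0.0, 0.0, x],
        [0.0, 1.0, 0.0, y],
        [0.0, 0.0, 1.0, z],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(a: Mat4, b: Mat4) -> Mat4:
    """Multiply two 4x4 matrices."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
        for r in range(4)
    ]


def aggregate_pose(anchors: Sequence[Tuple[Mat4, Mat4, float]]) -> Mat4:
    """Fuse several relative predictions into one global pose.

    Each entry is (world_from_anchor, anchor_from_target, confidence).
    Assumption: translation is the confidence-weighted mean of the
    candidates; rotation is copied from the most confident candidate.
    """
    # Each anchor proposes a global pose for the target frame.
    candidates = [(matmul4(w_a, a_t), conf) for w_a, a_t, conf in anchors]
    total = sum(conf for _, conf in candidates)
    if total <= 0:
        raise ValueError("at least one anchor needs positive confidence")

    # Confidence-weighted mean of the proposed translations.
    mean_t = [
        sum(conf * pose[r][3] for pose, conf in candidates) / total
        for r in range(3)
    ]

    # Start from the most confident candidate, then overwrite its translation.
    best_pose, _ = max(candidates, key=lambda item: item[1])
    fused = [row[:] for row in best_pose]
    for r in range(3):
        fused[r][3] = mean_t[r]
    return fused


# Two anchors disagree slightly about where the target frame is;
# the more confident anchor pulls the fused estimate toward its proposal.
estimate = aggregate_pose([
    (translation(0.0, 0.0, 0.0), translation(1.0, 0.0, 0.0), 3.0),
    (translation(2.0, 0.0, 0.0), translation(-0.6, 0.0, 0.0), 1.0),
])
print(estimate[0][3])
```

Because every prediction is expressed relative to a nearby anchor, the regressed translations stay small however long the sequence runs. Only the composed global poses grow in magnitude.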

Congrong Xu, Huachen Gao, Xingyu Chen, Yuliang Xiu, Jun Gao, Anpei Chen • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Pose Estimation | ETH3D | -- | 49 |
| Camera pose estimation | Sintel ~50 frames | ATE 0.115 | 41 |
| Camera pose estimation | TUM-dynamics 90 frames | ATE 0.012 | 24 |
| Camera pose estimation | Scannet 90 frames | ATE 0.037 | 24 |
| Point Map Estimation | 7-Scenes sparse view | Mean Accuracy 9.2 | 17 |
| Point Map Estimation | NRGBD sparse view | Accuracy (Mean) 4.7 | 17 |
| 3D Reconstruction | 7-Scenes Length 200 | Accuracy (Mean) 0.021 | 10 |
| 3D Reconstruction | 7-Scenes length 1000 | Accuracy (Mean) 2.2 | 9 |
| 3D Reconstruction | 7-Scenes length 500 | Accuracy 2.2 | 6 |
| Pose Estimation | RobustNeRF | ATE 0.152 | 4 |
Showing 10 of 11 rows
