Geometric Context Transformer for Streaming 3D Reconstruction

About

Streaming 3D reconstruction aims to recover 3D information, such as camera poses and point clouds, from a video stream, which necessitates geometric accuracy, temporal consistency, and computational efficiency. Motivated by the principles of Simultaneous Localization and Mapping (SLAM), we introduce LingBot-Map, a feed-forward 3D foundation model for reconstructing scenes from streaming data, built upon a geometric context transformer (GCT) architecture. A defining aspect of LingBot-Map lies in its carefully designed attention mechanism, which integrates an anchor context, a pose-reference window, and a trajectory memory to address coordinate grounding, dense geometric cues, and long-range drift correction, respectively. This design keeps the streaming state compact while retaining rich geometric context, enabling stable, efficient inference at around 20 FPS on 518 × 378 resolution inputs over long sequences exceeding 10,000 frames. Extensive evaluations across a variety of benchmarks demonstrate that our approach achieves superior performance compared to both existing streaming and iterative optimization-based approaches.
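
To make the three context roles concrete, here is a minimal bookkeeping sketch of how a streaming state might stay compact. It is not the authors' implementation: the class `StreamingContext`, its parameters (`num_anchor`, `window_size`), the eviction rule, and the idea of one memory token per evicted frame are all assumptions made for illustration. It only tracks which frame indices would sit in each role and how many tokens attention would then see.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class StreamingContext:
    """Hypothetical bookkeeping of which past frames remain visible to attention."""
    num_anchor: int = 1   # assumed: earliest frames that ground the coordinate system
    window_size: int = 4  # assumed: recent frames kept with their dense tokens
    anchors: list = field(default_factory=list)
    window: deque = field(default_factory=deque)
    trajectory: list = field(default_factory=list)  # assumed: one compact entry per evicted frame

    def push(self, frame_id: int) -> None:
        # The first frames become anchors and are never evicted.
        if len(self.anchors) < self.num_anchor:
            self.anchors.append(frame_id)
            return
        # Later frames enter the sliding window; the oldest one is
        # demoted to the compact trajectory memory when the window is full.
        self.window.append(frame_id)
        if len(self.window) > self.window_size:
            self.trajectory.append(self.window.popleft())

    def token_count(self, dense_tokens_per_frame: int, memory_tokens_per_frame: int = 1) -> int:
        # Anchors and window frames keep dense tokens; memory frames keep few.
        dense_frames = len(self.anchors) + len(self.window)
        return (dense_frames * dense_tokens_per_frame
                + len(self.trajectory) * memory_tokens_per_frame)


ctx = StreamingContext(num_anchor=1, window_size=4)
for frame_id in range(10):
    ctx.push(frame_id)
print(ctx.anchors, list(ctx.window), ctx.trajectory)
print(ctx.token_count(dense_tokens_per_frame=100))
```

Under these assumptions the dense part of the state stays constant in size while only the lightweight memory grows with sequence length, which is one way a long stream could remain tractable.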

Lin-Zhuo Chen, Jian Gao, Yihang Chen, Ka Leong Cheng, Yipengjing Sun, Liangxiao Hu, Nan Xue, Xing Zhu, Yujun Shen, Yao Yao, Yinghao Xu • 2026

Related benchmarks

Task | Dataset | Result | Rank
Video Depth Estimation | KITTI | Abs Rel 0.098 | 148
3D Reconstruction | 7 Scenes | -- | 128
3D Reconstruction | NRGBD | -- | 66
Pose Estimation | ETH3D | AUC @ Threshold 3: 0.2779 | 49
3D Geometry Estimation and Reconstruction | SpatialBench Average across settings | Absolute Relative Error 18.1 | 42
3D Geometry Estimation and Reconstruction | SpatialBench Medium | AbsRel 0.114 | 42
3D Geometry Estimation and Reconstruction | SpatialBench Single Frame | AbsRel 0.333 | 42
3D Geometry Estimation and Reconstruction | SpatialBench Sparse | AbsRel 0.138 | 42
3D Reconstruction | ETH3D | F1 Score 98.98 | 35
Camera pose estimation | Oxford Spires | ATE 15.46 | 26
Showing 10 of 25 rows
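
For readers unfamiliar with the depth metric listed above, the following is a minimal sketch of absolute relative error (AbsRel) under its common definition: the mean of |predicted − ground truth| / ground truth over pixels with valid ground truth. The function name `abs_rel` and the validity rule (ground-truth depth greater than zero) are assumptions for illustration. The listing does not say whether each benchmark applies scale alignment first or reports the value as a fraction or a percentage.

```python
def abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over entries whose ground-truth depth is positive."""
    # Assumed validity rule: ground truth of 0 (or less) marks a missing measurement.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


# Three pixels; the last one has no ground truth and is skipped.
score = abs_rel([1.1, 2.0, 5.0], [1.0, 2.0, 0.0])
print(round(score, 4))
```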

Other info

GitHub