
Ray-Aware Pointer Memory with Adaptive Updates for Streaming 3D Reconstruction

About

Dense 3D reconstruction from continuous image streams requires both accurate geometric aggregation and stable long-term memory management. Recent feed-forward reconstruction frameworks integrate observations through persistent memory representations, yet most rely primarily on appearance-based similarity when updating memory. Such appearance-driven integration often leads to redundant accumulation of observations and unstable geometry when viewpoint changes occur. In this work, we propose a ray-aware pointer memory for streaming 3D reconstruction that explicitly models both spatial location and viewing direction within a unified memory representation. Each memory pointer stores its 3D position, associated ray direction, and feature embedding, allowing the system to reason jointly about geometric proximity and viewpoint consistency. Based on this representation, we introduce an adaptive pointer update strategy that replaces traditional fusion-based memory compression with a retain-or-replace mechanism. Instead of averaging nearby observations, the system selectively retains informative pointers while discarding redundant ones, preserving distinctive geometric structures while maintaining bounded memory growth. Furthermore, the joint reasoning over spatial distance and ray-direction discrepancy enables the system to distinguish between local redundancy, novel observations, and potential loop revisits in a unified manner. When loop candidates are detected, pose refinement is triggered to enforce global geometric consistency across the reconstruction. Extensive experiments demonstrate that the proposed ray-aware memory design significantly improves long-term reconstruction stability and camera pose accuracy while maintaining efficient streaming inference. Our approach provides a principled framework for scalable and drift-resistant online 3D reconstruction from image streams.
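The retain-or-replace update described above could be sketched roughly as follows. This is an illustrative guess and not the authors' implementation: the `Pointer` fields mirror the abstract (position, ray direction, feature embedding), but the `score` field, the thresholds `dist_thresh` and `angle_thresh`, the function name `update_memory`, and the exact decision rule (near and similar ray means redundant, far means novel, near with a very different ray means loop candidate) are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Pointer:
    position: tuple  # 3D point (x, y, z)
    ray: tuple       # unit viewing direction of the observation
    feature: tuple   # feature embedding
    score: float     # hypothetical informativeness score (assumption)


def ray_angle(a, b):
    """Angle in radians between two unit ray directions."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))


def update_memory(memory, new, dist_thresh=0.05, angle_thresh=math.radians(30)):
    """Insert `new` into `memory` in place and return the assumed case label.

    Thresholds and the three-way rule are assumptions, not values from the paper.
    """
    if not memory:
        memory.append(new)
        return "novel"

    # Nearest stored pointer by Euclidean distance (linear scan for clarity).
    idx = min(range(len(memory)),
              key=lambda i: math.dist(memory[i].position, new.position))
    nearest = memory[idx]

    if math.dist(nearest.position, new.position) > dist_thresh:
        # Far from everything stored: a new part of the scene.
        memory.append(new)
        return "novel"

    if ray_angle(nearest.ray, new.ray) <= angle_thresh:
        # Same place, similar viewpoint: keep the more informative pointer
        # instead of averaging the two (retain-or-replace, no fusion).
        if new.score > nearest.score:
            memory[idx] = new
        return "redundant"

    # Same place, very different viewpoint: keep it and flag a possible
    # loop revisit so the caller can trigger pose refinement.
    memory.append(new)
    return "loop_candidate"
```

In this sketch the redundant case never grows the memory, which is one simple way to read the bounded-growth claim; a real system would likely also cap the total pointer count and use a spatial index rather than a linear scan.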

Feifei Li, Qi Song, Chi Zhang, Rui Huang • 2026

Related benchmarks

Task | Dataset | Result | Rank
Camera pose estimation | Sintel | ATE 0.213 | 203
Depth Estimation | KITTI | -- | 156
3D Reconstruction | 7 Scenes | Accuracy Median 1.9 | 128
Camera pose estimation | TUM dynamics | ATE 0.049 | 90
Depth Estimation | BONN | Abs Rel 0.059 | 63
3D Reconstruction | NRGBD | Accuracy Mean 6.1 | 63
Camera pose estimation | ScanNet static indoor scenes | ATE 0.086 | 40
Depth Estimation | Sintel | Abs Rel 0.376 | 29
Depth Estimation | NYU Static v2 | Abs Rel 0.073 | 7
