
TTSA3R: Training-Free Temporal-Spatial Adaptive Persistent State for Streaming 3D Reconstruction

About

Streaming recurrent models enable efficient 3D reconstruction by maintaining persistent state representations. However, they suffer from catastrophic forgetting over long sequences because they must balance historical information against new observations. Recent methods alleviate this by deriving adaptive signals from an attention perspective, but they operate on a single dimension without considering temporal and spatial consistency. To this end, we propose TTSA3R, a training-free framework that leverages both temporal state evolution and spatial observation quality for adaptive state updates in 3D reconstruction. In particular, we devise a Temporal Adaptive Update Module that regulates update magnitude by analyzing temporal state evolution patterns. A Spatial Contextual Update Module then localizes the spatial regions that require updates through observation-state alignment and scene dynamics. These complementary signals are finally fused to determine the state update strategy. Extensive experiments demonstrate the effectiveness of TTSA3R across diverse 3D tasks. Moreover, on extended sequences of 3D reconstruction, our method exhibits only a 1.33x error increase, compared to over 4x degradation in the baseline model, significantly improving long-term reconstruction stability. Our code is available at https://github.com/anonus2357/ttsa3r.
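
The abstract describes fusing a temporal signal and a spatial signal into one gate for the state update. The following is a rough illustrative sketch of that general idea only, not the authors' implementation. The function names, the choice of relative state change as the temporal signal, cosine similarity as the spatial signal, the product fusion, and the direction in which each signal moves the gate are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def _norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def temporal_gate(prev_token, curr_token):
    # Assumption: use the relative change of a state token between two
    # consecutive steps, squashed into [0, 1). A token that did not move
    # gets a gate of 0.
    delta = _norm([c - p for c, p in zip(curr_token, prev_token)])
    return delta / (delta + _norm(curr_token) + 1e-8)


def spatial_gate(state_token, obs_token):
    # Assumption: use cosine similarity between a state token and an
    # observation feature, mapped from [-1, 1] to [0, 1].
    denom = _norm(state_token) * _norm(obs_token) + 1e-8
    cos = sum(s * o for s, o in zip(state_token, obs_token)) / denom
    return 0.5 * (cos + 1.0)


def fused_update(prev_state, curr_state, candidate, obs):
    # Assumption: fuse the two signals by multiplication, then move each
    # state token toward its candidate update by the fused gate.
    new_state = []
    for p, c, cand, o in zip(prev_state, curr_state, candidate, obs):
        g = temporal_gate(p, c) * spatial_gate(c, o)
        new_state.append([ci + g * (ki - ci) for ci, ki in zip(c, cand)])
    return new_state
```

With a fused gate of 0 a token keeps its current value, and with a gate of 1 it is replaced by the candidate, so the gate interpolates between preserving history and accepting the new observation.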

Zhijie Zheng, Xinhao Xiang, Jiawei Zhang • 2026

Related benchmarks

Task                    Dataset            Result                     Rank
Camera pose estimation  TUM-dynamic        ATE 0.012                  205
Camera pose estimation  Sintel             ATE 0.21                   203
Camera pose estimation  ScanNet            RPE (t) 0.02               133
3D Reconstruction       7 Scenes           --                         128
Camera pose estimation  TUM dynamics       ATE 0.026                  90
Depth Estimation        Sintel ~50 frames  AbsRel 0.402               70
Depth Estimation        KITTI 110 frames   AbsRel 11                  69
3D Reconstruction       NRGBD              Normalized Score (NC) 63   66
3D Reconstruction       NRGBD              Accuracy Mean 12.1         63
Video Depth Estimation  Bonn 110 frames    AbsRel 6.4                 63
Showing 10 of 17 rows
