
Flash-Mono: Feed-Forward Accelerated Gaussian Splatting Monocular SLAM

About

Monocular 3D Gaussian Splatting SLAM suffers from critical limitations in time efficiency, geometric accuracy, and multi-view consistency. These issues stem from the time-consuming $\textit{Train-from-Scratch}$ optimization and the lack of inter-frame scale consistency from single-frame geometry priors. We contend that a feed-forward paradigm, leveraging multi-frame context to predict Gaussian attributes directly, is crucial for addressing these challenges. We present Flash-Mono, a system composed of three core modules: a feed-forward prediction frontend, a 2D Gaussian Splatting mapping backend, and an efficient hidden-state-based loop closure module. We train a recurrent feed-forward frontend model that progressively aggregates multi-frame visual features into a hidden state via cross attention and jointly predicts camera poses and per-pixel Gaussian properties. By directly predicting Gaussian attributes, our method bypasses the burdensome per-frame optimization required in optimization-based GS-SLAM, achieving a $\textbf{10x}$ speedup while ensuring high-quality rendering. The power of our recurrent architecture extends beyond efficient prediction: the hidden states act as compact submap descriptors, facilitating efficient loop closure and global $\mathrm{Sim}(3)$ optimization to mitigate the long-standing challenge of drift. For enhanced geometric fidelity, we replace conventional 3D Gaussian ellipsoids with 2D Gaussian surfels. Extensive experiments demonstrate that Flash-Mono achieves state-of-the-art performance in both tracking and mapping quality, highlighting its potential for embodied perception and real-time reconstruction applications. Project page: https://victkk.github.io/flash-mono.
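
The abstract mentions global $\mathrm{Sim}(3)$ optimization over submaps. As a rough illustration of what a $\mathrm{Sim}(3)$ element does, here is a minimal NumPy sketch of a similarity transform $x' = sRx + t$, which corrects scale drift in addition to rotation and translation. This is not the authors' implementation; the function names `sim3_matrix`, `sim3_apply`, and `sim3_inverse` are hypothetical and chosen only for this example.

```python
import numpy as np

def sim3_matrix(scale, rotation, translation):
    """Build a 4x4 similarity matrix [[s*R, t], [0, 1]] (illustrative helper)."""
    T = np.eye(4)
    T[:3, :3] = scale * np.asarray(rotation, dtype=float)
    T[:3, 3] = np.asarray(translation, dtype=float)
    return T

def sim3_apply(T, points):
    """Map an (N, 3) array of points through x' = s*R*x + t."""
    points = np.asarray(points, dtype=float)
    return points @ T[:3, :3].T + T[:3, 3]

def sim3_inverse(T):
    """Invert a similarity matrix: x = (1/s) * R^T * (x' - t)."""
    sR = T[:3, :3]
    s = np.cbrt(np.linalg.det(sR))  # det(s*R) = s^3 for a rotation R and s > 0
    R = sR / s
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T / s
    T_inv[:3, 3] = -(R.T @ T[:3, 3]) / s
    return T_inv

# Example: scale by 2, rotate 90 degrees about z, then shift along x.
R_z90 = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
T_example = sim3_matrix(2.0, R_z90, [1.0, 0.0, 0.0])
moved = sim3_apply(T_example, [[1.0, 0.0, 0.0]])
```

In a loop-closure setting, one such transform per submap would typically be the variable being optimized, so that overlapping submaps agree in scale as well as pose; how Flash-Mono parameterizes and solves this is described in the paper, not here.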

Zicheng Zhang, Ke Wu, Xiangting Meng, Keyu Liu, Jieru Zhao, Wenchao Ding • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mapping Quality | ScanNet V1 | SSIM | 79 | 24 |
| Mapping Quality | BundleFusion | SSIM | 0.72 | 20 |
| Visual Odometry | KITTI Odometry Sequence 05 | RMSE | 16.58 | 10 |
| Visual Odometry | KITTI Odometry Sequence 00 | RMSE | 12.85 | 8 |
| Visual Odometry | KITTI Odometry Sequence 06 | RMSE | 9.93 | 8 |
| Visual Odometry | KITTI Odometry Sequence 08 | RMSE | 45.25 | 8 |
| Tracking Performance | ScanNet V1 | Tracking Metric 0054 | 11.69 | 7 |
| Tracking Performance | BundleFusion | Tracking Score (apt0) | 11.44 | 7 |
| SLAM | TUM fr3/office | Total Gaussian Count | 0.61 | 6 |
| SLAM | TUM fr2 xyz | Total Gaussian Count | 0.98 | 6 |
Showing 10 of 20 rows
