
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution

About

A versatile video depth estimation model should (1) be accurate and consistent across frames, (2) produce high-resolution depth maps, and (3) support real-time streaming. We propose FlashDepth, a method that satisfies all three requirements, performing depth estimation on a 2044x1148 streaming video at 24 FPS. We show that, with careful modifications to pretrained single-image depth models, these capabilities are enabled with relatively little data and training. We evaluate our approach across multiple unseen datasets against state-of-the-art depth models, and find that ours outperforms them in terms of boundary sharpness and speed by a significant margin, while maintaining competitive accuracy. We hope our model will enable various applications that require high-resolution depth, such as video editing, and online decision-making, such as robotics. We release all code and model weights at https://github.com/Eyeline-Research/FlashDepth
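To put the headline figures in perspective, the arithmetic behind "2044x1148 at 24 FPS" can be sketched as below. The resolution and frame rate come from the abstract; the function names are hypothetical and chosen only for illustration, and nothing here reflects how FlashDepth itself is implemented.

```python
# Back-of-the-envelope numbers for a streaming depth model.
# Resolution and frame rate are taken from the abstract; the helper
# names are illustrative assumptions, not part of the FlashDepth code.

WIDTH, HEIGHT, FPS = 2044, 1148, 24


def pixels_per_frame(width: int, height: int) -> int:
    """Number of depth values produced for one frame."""
    return width * height


def frame_budget_ms(fps: float) -> float:
    """Maximum time available per frame to keep up with the stream."""
    return 1000.0 / fps


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Depth values the model must emit each second."""
    return pixels_per_frame(width, height) * fps


if __name__ == "__main__":
    print(pixels_per_frame(WIDTH, HEIGHT))        # 2346512
    print(round(frame_budget_ms(FPS), 2))         # 41.67
    print(pixels_per_second(WIDTH, HEIGHT, FPS))  # 56316288
```

In other words, keeping up with the stream means producing roughly 2.3 million depth values within about 41.7 ms per frame.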

Gene Chou, Wenqi Xian, Guandao Yang, Mohamed Abdelfattah, Bharath Hariharan, Noah Snavely, Ning Yu, Paul Debevec • 2025

Related benchmarks

Task | Dataset | Result | Rank
Depth Estimation | KITTI | -- | 156
Monocular Depth Estimation | Sintel | Abs Rel 0.288 | 127
Depth Estimation | Sintel ~50 frames | AbsRel 0.265 | 70
Depth Estimation | KITTI 110 frames | AbsRel 10.3 | 69
Monocular Depth Estimation | KITTI | AbsRel 8.4 | 69
Video Depth Estimation | Bonn 110 frames | AbsRel 5.3 | 63
Monocular Depth Estimation | BONN | Delta 1.25 Accuracy 96.7 | 60
Depth Estimation | Sintel | AbsRel 0.36 | 29
Video Depth Estimation | Scannet 90 frames | AbsRel 0.101 | 22
Depth Estimation | TUM-RGBD | Abs Rel Error 0.08 | 16
Showing 10 of 20 rows
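
The two metrics named in the table, AbsRel and Delta 1.25 accuracy, are commonly defined as the mean of |pred - gt| / gt and the fraction of pixels where max(pred/gt, gt/pred) < 1.25. The sketch below follows those common definitions and is not taken from the FlashDepth evaluation code; the function names, the flat-sequence inputs, and the rule that non-positive values are treated as invalid are assumptions made for illustration. Benchmarks also differ in how predictions are aligned to ground truth (for example, scale or scale-and-shift alignment), which is omitted here. Some rows above appear to report values as percentages (e.g. 10.3) and others as fractions (e.g. 0.288); the functions below return fractions.

```python
from typing import Sequence


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]):
    """Keep only pixels where both depths are positive (assumed validity rule)."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    return [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean absolute relative error: mean(|pred - gt| / gt). Lower is better."""
    pairs = _valid_pairs(pred, gt)
    if not pairs:
        raise ValueError("no valid pixels")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def delta_accuracy(pred: Sequence[float], gt: Sequence[float],
                   threshold: float = 1.25) -> float:
    """Fraction of pixels with max(pred/gt, gt/pred) < threshold. Higher is better."""
    pairs = _valid_pairs(pred, gt)
    if not pairs:
        raise ValueError("no valid pixels")
    hits = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    return hits / len(pairs)


if __name__ == "__main__":
    # Flattened toy depth maps in metres; the last ground-truth pixel is invalid.
    predicted = [1.1, 2.0, 4.0, 3.0]
    ground_truth = [1.0, 2.0, 2.0, 0.0]
    print(abs_rel(predicted, ground_truth))
    print(delta_accuracy(predicted, ground_truth))
```

In the toy example, three pixels are valid; their relative errors are 0.1, 0, and 1.0, and two of the three fall within the 1.25 ratio threshold.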
