
PromptDepthAnything++: Accurate 4K Metric Depth Estimation via Pattern-Agnostic Prompting

About

Prompts play a critical role in unleashing the power of language and vision foundation models for specific tasks. For the first time, we introduce prompting into depth foundation models, creating a new paradigm for metric depth estimation termed Prompt Depth Anything. Specifically, we use a low-cost LiDAR as the prompt to guide the Depth Anything model for accurate metric depth output, achieving up to 4K resolution. Our approach centers on a concise prompt fusion design that integrates the LiDAR depth at multiple scales within the depth decoder. To address training challenges posed by limited datasets containing both LiDAR depth and precise GT depth, we propose a scalable data pipeline that includes LiDAR simulation for synthetic data and pseudo GT depth generation for real data. To further extend our method to work with any prompt depth points, we propose a new prompting mechanism, which serializes the input depth points into tokens and uses self-attention to enhance image tokens from depth foundation models. Our approach sets a new state of the art on 8 zero-shot depth benchmarks and benefits downstream applications, including 3D reconstruction and generalized robotic grasping. The code is available at https://github.com/DepthAnything/PromptDA.
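The pattern-agnostic prompting step described in the abstract (depth points serialized into tokens, then self-attention over image and depth tokens) can be illustrated with a minimal sketch. This is not the authors' implementation: the (u, v, depth) feature layout, the random stand-ins for learned weights, the single attention head, the residual update, and every function name below are assumptions made purely for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for a learned weight matrix (random here)."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def serialize_depth_points(points, height, width, dim, rng):
    """Map each (u, v, depth) sample to one dim-dimensional token.

    Pixel coordinates are normalized to [0, 1]; depth is kept in metres.
    Any number of points in any spatial pattern is accepted.
    """
    feats = [[u / (width - 1), v / (height - 1), d] for u, v, d in points]
    return matmul(feats, random_matrix(3, dim, rng))

def self_attention(tokens, rng):
    """Single-head scaled dot-product self-attention over all tokens."""
    dim = len(tokens[0])
    q = matmul(tokens, random_matrix(dim, dim, rng))
    k = matmul(tokens, random_matrix(dim, dim, rng))
    v = matmul(tokens, random_matrix(dim, dim, rng))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, v), weights

def enhance_image_tokens(image_tokens, points, height, width, seed=0):
    """Attend jointly over image and depth tokens, then keep the image tokens."""
    rng = random.Random(seed)
    dim = len(image_tokens[0])
    depth_tokens = serialize_depth_points(points, height, width, dim, rng)
    attended, weights = self_attention(image_tokens + depth_tokens, rng)
    n = len(image_tokens)
    # Residual update (an assumption): each image token absorbs attended context.
    enhanced = [[x + y for x, y in zip(tok, upd)]
                for tok, upd in zip(image_tokens, attended[:n])]
    return enhanced, weights

# Usage: 16 image tokens of width 8 and three sparse depth samples.
rng = random.Random(1)
image_tokens = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]
points = [(10, 20, 1.5), (100, 50, 2.3), (200, 150, 0.8)]
enhanced, weights = enhance_image_tokens(image_tokens, points, height=192, width=256)
```

Because each depth sample becomes its own token, the token count simply follows the number of prompt points, which is one plausible reading of why such a design does not depend on a fixed sampling pattern.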

Haotong Lin, Sida Peng, Qinglin Yang, Peishan Yang, Jiaming Sun, Ruizhen Hu, Kai Xu, Hujun Bao, Bingyi Kang, Xiaowei Zhou • 2024

Related benchmarks

Task                                 Dataset                       Result               Rank
Depth Completion                     NYU-depth-v2 official (test)  --                   187
Depth Completion                     KITTI (test)                  --                   67
Depth Estimation                     ARKitScenes                   L1 Error 0.0132      57
Depth Super-Resolution / Completion  ETH-3D (test)                 AbsRel 1.04          41
3D Reconstruction                    DTU                           Average Error 1.02   32
Depth Estimation                     ScanNet++                     L1 Loss 0.025        15
TSDF Reconstruction                  ScanNet++                     Accuracy 7.46        15
Depth Completion                     ScanNet (test)                MAE 0.017            10
Depth Completion                     Replica                       F-Score (Avg) 90     7
Depth Completion                     DIODE (test)                  AbsRel 0.255         5
Showing 10 of 11 rows
