Large Depth Completion Model from Sparse Observations
About
This work presents the Large Depth Completion Model (LDCM), a simple, effective, and robust framework for single-view metric depth estimation from sparse observations. Without relying on complex architectural designs, LDCM uses a transformer to generate metric-accurate dense depth maps. We achieve this from two key perspectives: (1) leveraging existing monocular foundation models to improve the quality of sparse depth inputs, and (2) reformulating the training objective to better capture geometric structure and metric consistency. For the first, a Poisson-based depth initialization strategy generates a uniform coarse dense depth map from diverse sparse observations, providing a strong structural prior for the network. For the second, we replace the conventional depth head with a point map head that regresses per-pixel 3D coordinates in camera space, so the model directly learns the underlying 3D scene structure instead of performing pixel-wise depth map restoration. This design also eliminates the need for camera intrinsic parameters, allowing LDCM to naturally produce metric-scaled 3D point maps. Extensive experiments demonstrate that LDCM consistently outperforms state-of-the-art methods across multiple benchmarks and varying sparsity levels in both depth completion and point map estimation, showcasing its effectiveness and strong generalization to unseen data distributions.
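
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the Poisson formulation, so `poisson_fill` (a hypothetical name) simply solves Laplace's equation, i.e. the Poisson equation with a zero source term, by Jacobi iteration while holding the measured pixels fixed. It also omits the monocular foundation model prior mentioned above. `pointmap_to_depth` (also hypothetical) assumes a camera-space convention in which the Z axis points along the viewing direction, so depth is the Z channel of the point map.

```python
import numpy as np


def poisson_fill(sparse_depth, valid_mask, iters=500):
    """Densify sparse depth by Jacobi iteration on Laplace's equation.

    Assumption: a zero-source Poisson problem with measured pixels as
    fixed (Dirichlet) constraints; the paper's actual scheme may differ.
    """
    sparse_depth = np.asarray(sparse_depth, dtype=np.float64)
    depth = sparse_depth.copy()
    # Initialise unknown pixels with the mean of the known measurements.
    depth[~valid_mask] = sparse_depth[valid_mask].mean()
    for _ in range(iters):
        # Replicate-pad so out-of-image neighbours equal the border pixel
        # (a zero-gradient boundary condition).
        p = np.pad(depth, 1, mode="edge")
        avg = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0
        # Keep measurements fixed; update only the unknown pixels.
        depth = np.where(valid_mask, sparse_depth, avg)
    return depth


def pointmap_to_depth(point_map):
    """Return the depth map from an (H, W, 3) camera-space point map.

    Assumption: Z is the forward axis, so depth is the last channel's
    third component. No camera intrinsics are needed for this step.
    """
    return np.asarray(point_map)[..., 2]


# Toy example: depth known only on the left and right image columns.
sparse = np.zeros((5, 5))
mask = np.zeros((5, 5), dtype=bool)
sparse[:, 0], mask[:, 0] = 1.0, True
sparse[:, 4], mask[:, 4] = 3.0, True
coarse = poisson_fill(sparse, mask)

# Toy point map whose Z channel is the coarse depth.
points = np.stack([np.zeros_like(coarse), np.zeros_like(coarse), coarse], axis=-1)
depth_from_points = pointmap_to_depth(points)
```

In this toy case the filled map is a smooth ramp between the two measured columns, which is the kind of coarse but dense prior the abstract describes feeding to the network.
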
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Depth Completion | KITTI | RMSE | 1.911 | 53 |
| Depth Completion | VOID (test) | MAE | 0.145 | 34 |
| Depth Completion | ETH3D (test) | RMSE | 0.187 | 32 |
| Depth Estimation | DIODE Indoor | Relative Error (REL) | 0.014 | 24 |
| Depth Completion | NYU v2 (test) | MAE | 0.037 | 21 |
| Point Map Estimation | KITTI | -- | -- | 19 |
| Depth Completion | DIODE Outdoor | RMSE | 1.969 | 16 |
| Depth Completion | Average all benchmarks | RMSE | 0.862 | 16 |
| Point Map Estimation | Average | Absolute Relative Error (Abs Rel) | 0.042 | 16 |
| Point Map Estimation | DIODE Outdoor | RELp | 4.4 | 15 |
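
For readers unfamiliar with the metric names in the table, the sketch below computes RMSE, MAE, and absolute relative error under their standard definitions. The function name `depth_metrics` and the default validity mask (ground truth greater than zero) are assumptions for illustration; each benchmark's actual protocol (units, depth caps, masking) is not stated here and may differ. RELp is left out because the page does not define it.

```python
import numpy as np


def depth_metrics(pred, gt, valid=None):
    """Standard depth error metrics over valid ground-truth pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if valid is None:
        # Assumption: pixels without ground truth are stored as 0.
        valid = gt > 0
    p, g = pred[valid], gt[valid]
    err = p - g
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),   # root mean squared error
        "mae": float(np.mean(np.abs(err))),          # mean absolute error
        "abs_rel": float(np.mean(np.abs(err) / g)),  # mean of |error| / ground truth
    }


# Tiny example: one pixel has no ground truth and is ignored.
pred = np.array([[2.0, 4.0], [9.0, 5.0]])
gt = np.array([[1.0, 4.0], [0.0, 4.0]])
metrics = depth_metrics(pred, gt)
```
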