Scale Propagation Network for Generalizable Depth Completion

About

Depth completion, inferring dense depth maps from sparse measurements, is crucial for robust 3D perception. Although deep-learning-based methods have made tremendous progress on this problem, these models do not generalize well to scenes that are unobserved in training, a fundamental limitation that has yet to be overcome. A careful analysis of existing deep neural network architectures for depth completion, which largely borrow from successful backbones for image analysis tasks, reveals that a key design bottleneck resides in the conventional normalization layers. These normalization layers are designed, on the one hand, to make training more stable and, on the other hand, to build more visual invariance across scene scales. However, in depth completion, the scale is exactly what we want to estimate robustly in order to generalize better to unseen scenes. To mitigate this, we propose a novel scale propagation normalization (SP-Norm) method that propagates scales from input to output while preserving the normalization operator for easy convergence. More specifically, we rescale the input using features learned by a single-layer perceptron from the normalized input, rather than directly normalizing the input as conventional normalization layers do. We then develop a new network architecture based on SP-Norm and the ConvNeXt V2 backbone. We explore the composition of various basic blocks and architectures to achieve superior performance and efficient inference for generalizable depth completion. Extensive experiments are conducted on six unseen datasets with various types of sparse depth maps, i.e., randomly sampled 0.1%/1%/10% valid pixels, 4/8/16/32/64-line LiDAR points, and holes from Structured-Light. Our model consistently achieves the best accuracy with faster inference and lower memory use than state-of-the-art methods.
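
The rescaling idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the choice of layer normalization over the channel axis, the absence of an activation, the (C, C) weight shape, and the names `layer_norm` and `sp_norm` are all assumptions made for illustration. The sketch only shows why multiplying the raw input by a gain computed from the normalized input lets the input scale pass through to the output.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each feature vector (last axis) to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def sp_norm(x, weight, bias, eps=1e-6):
    """Hypothetical SP-Norm sketch: rescale x by a single-layer perceptron
    applied to the normalized x, instead of returning the normalized x.

    x:      (..., C) input features
    weight: (C, C) perceptron weights (assumed shape)
    bias:   (C,) perceptron bias
    """
    gain = layer_norm(x, eps) @ weight + bias  # depends only on normalized input
    return x * gain                            # raw input carries the scale

rng = np.random.default_rng(0)
channels = 8
x = rng.uniform(1.0, 10.0, size=(4, channels))
weight = rng.normal(scale=0.1, size=(channels, channels))
bias = np.ones(channels)

y = sp_norm(x, weight, bias)

# Plain normalization discards the input scale ...
scale_lost = np.allclose(layer_norm(3.0 * x), layer_norm(x), atol=1e-4)
# ... while the rescaled output follows it (up to the effect of eps).
scale_kept = np.allclose(sp_norm(3.0 * x, weight, bias), 3.0 * y, rtol=1e-4)
print(scale_lost, scale_kept)
```

Because the gain is computed from the normalized input, it is approximately unchanged when the input is multiplied by a constant, so the output scales with the input. The perceptron still operates on normalized features, which is the property the abstract associates with easy convergence.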

Haotian Wang, Meng Yang, Xinhu Zheng, Gang Hua • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Depth Completion | KITTI | RMSE 2.365 | 53 |
| Depth Completion | VOID (test) | MAE 0.178 | 34 |
| Depth Completion | ETH3D (test) | RMSE 0.419 | 32 |
| Depth Completion | NYU v2 (test) | MAE 0.047 | 21 |
| Point Map Estimation | KITTI | -- | 19 |
| Depth Completion | DIODE Outdoor | RMSE 2.111 | 16 |
| Depth Completion | Average all benchmarks | RMSE 1.048 | 16 |
| Point Map Estimation | Average | Absolute Relative Error (Abs Rel) 0.075 | 16 |
| Point Map Estimation | DIODE Outdoor | RELp 10 | 15 |
| Point Map Estimation | DIODE Indoor | MAEp 0.172 | 9 |
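
For readers unfamiliar with the metric names in the table, the following sketch computes RMSE, MAE, and Abs Rel in their standard forms. The page does not state units or the evaluation protocol, so the convention that a ground-truth value of 0 marks a missing pixel, and the helper name `depth_errors`, are assumptions. RELp and MAEp are not defined on this page and are not covered.

```python
import numpy as np

def depth_errors(pred, gt):
    """Standard depth error metrics over pixels with valid ground truth.

    Assumes gt == 0 marks a missing measurement (a common convention,
    not confirmed by this page).
    """
    valid = gt > 0
    diff = pred[valid] - gt[valid]
    return {
        "rmse": float(np.sqrt(np.mean(diff ** 2))),           # root mean squared error
        "mae": float(np.mean(np.abs(diff))),                  # mean absolute error
        "abs_rel": float(np.mean(np.abs(diff) / gt[valid])),  # error relative to true depth
    }

# Toy 2x2 example; the top-right pixel has no ground truth and is ignored.
gt = np.array([[2.0, 0.0], [4.0, 8.0]])
pred = np.array([[2.5, 1.0], [3.0, 8.0]])
errors = depth_errors(pred, gt)
print(errors)
```

Lower is better for all three metrics. Abs Rel divides each error by the true depth, so it is less dominated by distant pixels than RMSE.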
