
Improved monocular depth prediction using distance transform over pre-semantic contours with self-supervised neural networks

About

Monocular depth estimation (MDE) with self-supervised training struggles in low-texture areas, where photometric losses may lead to ambiguous depth predictions. To address this, we propose a novel technique that enhances spatial information by applying a distance transform over pre-semantic contours, augmenting discriminative power in low-texture regions. Our approach jointly estimates pre-semantic contours, depth, and ego-motion. The pre-semantic contours are used to produce new input images whose variance in uniform areas is augmented by the distance transform. This results in more effective loss functions, improving the training of depth and ego-motion. We demonstrate theoretically that the distance transform is the optimal variance-augmenting technique in this context. Through extensive experiments on KITTI, Cityscapes, Waymo, NYUv2, and ScanNet, our model demonstrates robust performance, surpassing competing self-supervised methods in MDE.

Marwane Hariat, Antoine Manzanera, David Filliat • 2026
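
The variance-augmentation idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the brute-force `distance_transform` helper, the `variance` helper, and the hand-made contour map are all assumptions made for illustration, and the way the paper combines the transform with the input image is not described on this page. The sketch only shows that a perfectly uniform region (zero variance, hence uninformative for a photometric loss) gains spatial variation once each pixel is replaced by its distance to the nearest contour.

```python
import math


def distance_transform(contours):
    """Brute-force Euclidean distance transform (illustrative, O(N * E)).

    contours: 2D list of 0/1 values, where 1 marks a contour pixel.
    Returns a 2D list holding, for each pixel, the distance to the
    nearest contour pixel.
    """
    h, w = len(contours), len(contours[0])
    edge_pixels = [(y, x) for y in range(h) for x in range(w) if contours[y][x]]
    if not edge_pixels:
        raise ValueError("contour map contains no contour pixels")
    return [
        [min(math.hypot(y - ey, x - ex) for ey, ex in edge_pixels) for x in range(w)]
        for y in range(h)
    ]


def variance(values):
    """Population variance of a flat list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical toy scene: a uniform 5x7 patch bounded by contours
# on its left and right borders.
h, w = 5, 7
image = [[0.5] * w for _ in range(h)]
contours = [[1 if x in (0, w - 1) else 0 for x in range(w)] for _ in range(h)]

dt = distance_transform(contours)

middle_row_image = image[h // 2]  # constant intensities
middle_row_dt = dt[h // 2]        # ramps up toward the patch centre

print("image row variance:", variance(middle_row_image))
print("distance-transform row:", middle_row_dt)
print("distance-transform row variance:", variance(middle_row_dt))
```

For real images, an optimized routine such as `scipy.ndimage.distance_transform_edt` would be the usual choice. Note that it measures distance to the nearest zero-valued pixel, so the contour mask would need to be inverted first.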

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Depth Estimation | NYU v2 (test) | Threshold Accuracy (delta < 1.25): 85.9 | 435 |
| Optical Flow | KITTI 2015 (test) | -- | 109 |
| Monocular Depth Estimation | Cityscapes | Accuracy (delta < 1.25): 85 | 74 |
| Monocular Depth Estimation | KITTI 2015 (test) | Abs Rel: 0.116 | 22 |
| Monocular Depth Estimation | KITTI 2015 | Abs Rel: 0.082 | 14 |
| Depth Estimation | ScanNet v1 (test) | Abs Rel: 0.127 | 14 |
| Visual Odometry | KITTI Odometry Seq. 09 | t_err: 8.39 | 12 |
| Odometry | KITTI Odometry Sequence 10 | Translational Error (%): 7.17 | 9 |
| Depth Estimation | Waymo (test) | Abs Rel: 0.125 | 2 |
| Depth Estimation | KITTI 2015 (test) | Abs Rel: 0.083 | 2 |
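
For readers unfamiliar with the depth metrics listed above, here is a minimal sketch of their standard definitions: Abs Rel is the mean of |d − d*| / d* over pixels, and threshold accuracy is the fraction of pixels with max(d / d*, d* / d) < 1.25, where d is the predicted depth and d* the ground truth. The function names and sample values are hypothetical. The sketch assumes all depths are strictly positive and skips the valid-pixel masking, depth capping, and scale alignment that benchmark protocols typically apply.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error: mean(|pred - gt| / gt)."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def threshold_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels whose ratio max(pred/gt, gt/pred) is below threshold."""
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    return hits / len(gt)


# Hypothetical depths in metres for four pixels (not taken from any benchmark).
gt_depth = [2.0, 4.0, 10.0, 8.0]
pred_depth = [2.0, 6.0, 9.0, 8.0]

rel = abs_rel(pred_depth, gt_depth)
acc = threshold_accuracy(pred_depth, gt_depth)

print(f"Abs Rel: {rel:.3f}")
print(f"delta < 1.25: {acc:.2f}")
```

Here three of the four predictions fall within the 1.25 ratio, so the threshold accuracy is 0.75, and the Abs Rel is 0.15. The accuracy figures in the table appear to be reported as percentages rather than fractions.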
