S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation

About

Humans can infer the 3D geometry of a scene from a sketch instead of a realistic image, which indicates that spatial structure plays a fundamental role in understanding the depth of scenes. We are the first to explore the learning of a depth-specific structural representation, which captures the essential features for depth estimation and ignores irrelevant style information. Our S2R-DepthNet (Synthetic to Real DepthNet) generalizes well to unseen real-world data directly, even though it is trained only on synthetic data. S2R-DepthNet consists of: a) a Structure Extraction (STE) module, which extracts a domain-invariant structural representation from an image by disentangling the image into domain-invariant structure and domain-specific style components, b) a Depth-specific Attention (DSA) module, which learns task-specific knowledge to suppress depth-irrelevant structures for better depth estimation and generalization, and c) a depth prediction (DP) module, which predicts depth from the depth-specific representation. Without access to any real-world images, our method even outperforms state-of-the-art unsupervised domain adaptation methods that use real-world images of the target domain for training. In addition, when using a small amount of labeled real-world data, we achieve state-of-the-art performance under the semi-supervised setting. The code and trained models are available at https://github.com/microsoft/S2R-DepthNet.
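The data flow between the three modules can be pictured with a small Python sketch. This is an illustration only, not the released implementation: the function name `s2r_forward` and the toy stand-ins `toy_ste`, `toy_dsa`, and `toy_dp` are hypothetical, and two details are assumptions that the abstract does not state, namely that the attention map is computed from the input image and that suppression is an element-wise product with the structure map.

```python
from typing import Callable, List

Image = List[List[float]]  # toy single-channel image as nested lists
Module = Callable[[Image], Image]


def s2r_forward(image: Image, ste: Module, dsa: Module, dp: Module) -> Image:
    """Compose the three stages named in the abstract (STE, DSA, DP)."""
    structure = ste(image)  # domain-invariant structure map
    # ASSUMPTION: attention is predicted from the input image.
    attention = dsa(image)
    # ASSUMPTION: depth-irrelevant structure is suppressed by an
    # element-wise product of structure map and attention map.
    depth_specific = [
        [s * a for s, a in zip(s_row, a_row)]
        for s_row, a_row in zip(structure, attention)
    ]
    return dp(depth_specific)  # depth from the depth-specific representation


# Hypothetical stand-ins so the sketch runs; these are NOT learned networks.
def toy_ste(img: Image) -> Image:
    # Absolute horizontal difference as a crude "structure" map.
    last = len(img[0]) - 1
    return [[abs(row[min(j + 1, last)] - row[j]) for j in range(len(row))]
            for row in img]


def toy_dsa(img: Image) -> Image:
    # Suppress the first row entirely, keep every other row.
    return [[0.0 if i == 0 else 1.0 for _ in row] for i, row in enumerate(img)]


def toy_dp(rep: Image) -> Image:
    # Scale the representation into a fake "depth" range.
    return [[10.0 * v for v in row] for row in rep]


image = [[0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
depth = s2r_forward(image, toy_ste, toy_dsa, toy_dp)
```

In this toy run the edge in the first row is removed by the attention map before depth prediction, while the edge in the second row survives. That is the role the abstract assigns to the DSA module.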

Xiaotian Chen, Yuwang Wang, Xuejin Chen, Wenjun Zeng • 2021

Related benchmarks

Task                         Dataset                    Result               Rank
Depth Estimation             KITTI (Eigen split)        RMSE 3.463           276
Depth Estimation             NYU Depth V2               RMSE 0.662           177
Monocular Depth Estimation   Make3D (test)              Abs Rel 0.49         132
Depth Prediction             Cityscapes (test)          RMSE 11.164          52
Monocular Depth Estimation   KITTI v1 (Eigen split)     Acc (δ < 1.25) 79.3  15
Depth Estimation             DrivingStereo v1 (test)    Acc (< 1.25) 73.7    3
Depth Estimation             nuScenes v1 (test)         delta < 1.25 60.1    3
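The page does not define the metrics behind these results. As a minimal sketch, assuming the definitions commonly used in the depth estimation literature (RMSE, absolute relative error, and threshold accuracy with δ = max(pred/gt, gt/pred)), they could be computed as follows. The function names and sample values are illustrative. Benchmarks typically also apply a validity mask and a depth cap, which are omitted here.

```python
import math
from typing import Sequence


def rmse(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Root mean squared error between predicted and ground-truth depths."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean absolute error relative to the ground-truth depth."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def delta_accuracy(pred: Sequence[float], gt: Sequence[float],
                   threshold: float = 1.25) -> float:
    """Fraction of pixels whose ratio max(p/g, g/p) is below the threshold."""
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    return hits / len(gt)


# Illustrative values only (positive depths, e.g. in meters).
pred = [2.0, 4.5, 12.0]
gt = [2.0, 5.0, 8.0]
scores = (rmse(pred, gt), abs_rel(pred, gt), delta_accuracy(pred, gt))
```

Lower is better for RMSE and Abs Rel, and higher is better for threshold accuracy. The accuracy values in the results above appear to be percentages, whereas `delta_accuracy` returns a fraction between 0 and 1.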
