Moving Indoor: Unsupervised Video Depth Learning in Challenging Environments

About

Recently, unsupervised learning of depth from videos has made remarkable progress, and its results are comparable to those of fully supervised methods in outdoor scenes such as KITTI. However, great challenges remain when this technology is applied directly to indoor environments, e.g., large textureless regions such as white walls, the more complex ego-motion of handheld cameras, transparent glass, and shiny objects. To overcome these problems, we propose a new optical-flow-based training paradigm that reduces the difficulty of unsupervised learning by providing a clearer training target and that handles textureless regions. Our experimental evaluation demonstrates that the results of our method are comparable to those of fully supervised methods on the NYU Depth V2 benchmark. To the best of our knowledge, this is the first quantitative result of a purely unsupervised learning method reported on indoor datasets.

Junsheng Zhou, Yuwang Wang, Kaihuai Qin, Wenjun Zeng • 2019

Related benchmarks

Task	Dataset	Metric	Result
Depth Estimation	NYU v2 (test)	Threshold Accuracy (delta < 1.25)	67.4	435
Monocular Depth Estimation	NYU v2 (test)	Abs Rel	0.208	320
Surface Normal Estimation	NYU v2 (test)	Mean Angle Distance (MAD)	43.5	224
Monocular Depth Estimation	KITTI	Abs Rel	0.121	220
Depth Prediction	NYU Depth V2 (test)	Accuracy (δ < 1.25)	67.4	113
Depth Estimation	ScanNet (test)	Abs Rel	0.212	65
Single-view depth estimation	NYUv2 36 (test)	AbsRel	0.208	21
Single-view depth estimation	NYU official 654 images v2 (test)	AbsRel	0.208	21
Camera pose estimation	ScanNet 42 (test)	Rotation Error (deg)	1.96	4
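
For readers unfamiliar with the depth metrics in the table, the sketch below shows how Abs Rel and threshold accuracy (δ < 1.25) are commonly defined. It is a minimal illustration, not the authors' evaluation code: the function names, the flat-list inputs, and the rule of skipping pixels with non-positive depth are assumptions made here for clarity. The table appears to report threshold accuracy as a percentage, whereas this sketch returns a fraction.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error: mean(|pred - gt| / gt).

    Assumption: pixels with non-positive ground truth are treated as invalid.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def threshold_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold.

    Assumption: pixels with non-positive prediction or ground truth are skipped.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    hits = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    return hits / len(pairs)


# Hypothetical depths in metres for four pixels, for illustration only.
pred = [1.0, 2.2, 3.0, 5.0]
gt = [1.0, 2.0, 4.0, 4.0]

print("Abs Rel:", abs_rel(pred, gt))
print("delta < 1.25:", threshold_accuracy(pred, gt))
```

Lower is better for Abs Rel, and higher is better for threshold accuracy.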
