Feature-metric Loss for Self-supervised Learning of Depth and Egomotion

About

Photometric loss is widely used for self-supervised depth and egomotion estimation. However, the loss landscapes induced by photometric differences are often problematic for optimization: pixels in textureless regions produce plateaus, and less discriminative pixels produce multiple local minima. In this work, a feature-metric loss is proposed, defined on a feature representation that is itself learned in a self-supervised manner and regularized by both first-order and second-order derivatives, so that the loss landscapes form proper convergence basins. Comprehensive experiments and detailed analysis via visualization demonstrate the effectiveness of the proposed feature-metric loss. In particular, our method improves on state-of-the-art methods on KITTI from 0.885 to 0.925 as measured by $\delta_1$ for depth estimation, and significantly outperforms previous methods for visual odometry.
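The idea of comparing features instead of raw intensities, and of shaping those features through their derivatives, can be illustrated with a minimal 1-D sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the signals are single rows rather than 2-D feature maps, and the terms are one plausible reading of "regularized by first-order and second-order derivatives", not the exact formulas or weights used in the paper.

```python
def first_order(values):
    """Forward differences of a 1-D sequence (a discrete first derivative)."""
    return [b - a for a, b in zip(values, values[1:])]


def second_order(values):
    """Differences of differences (a discrete second derivative)."""
    return first_order(first_order(values))


def feature_metric_loss(source_feat, warped_target_feat):
    """Mean absolute difference between two feature rows of equal length."""
    diffs = [abs(s - t) for s, t in zip(source_feat, warped_target_feat)]
    return sum(diffs) / len(diffs)


def first_order_term(feat):
    """Negative mean |first derivative|.

    Minimizing this rewards features that vary across the row, which
    counteracts the flat (plateau) responses of textureless regions.
    """
    grads = first_order(feat)
    return -sum(abs(g) for g in grads) / len(grads)


def second_order_term(feat):
    """Mean |second derivative|.

    Minimizing this penalizes abrupt changes in the feature gradient,
    which favors smooth slopes over isolated spikes.
    """
    curv = second_order(feat)
    return sum(abs(c) for c in curv) / len(curv)


# Hypothetical feature rows used only to show how the terms behave.
flat = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]    # textureless: no gradient at all
ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]    # constant slope, zero curvature
spike = [0.0, 0.0, 5.0, 0.0, 0.0, 0.0]   # isolated peak, high curvature
```

In this toy setting the ramp scores best on both terms: it has a nonzero gradient everywhere (unlike the flat row) and zero second derivative (unlike the spike), which is the kind of feature profile that gives a warped comparison a single, wide basin to descend into.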

Chang Shu, Kun Yu, Zhixiang Duan, Kuiyuan Yang • 2020

Related benchmarks

Task	Dataset	Result
Monocular Depth Estimation	KITTI (Eigen)	Abs Rel 0.079	523
Depth Estimation	KITTI (Eigen split)	RMSE 4.427	291
Stereo Matching	KITTI 2015 (test)	--	233
Monocular Depth Estimation	KITTI (Eigen split)	Abs Rel 0.104	215
Monocular Depth Estimation	DDAD (test)	RMSE 12.45	122
Monocular Depth Estimation	KITTI (test)	Abs Rel Error 0.099	114
Monocular Depth Estimation	KITTI 2015 (Eigen split)	Abs Rel 0.099	95
Stereo Matching	Middlebury (test)	EPE 1.43	60
Stereo Matching	Middlebury	--	53
Stereo Matching	Inria SLFD	3 Pixel Error 12.97	41

Showing 10 of 18 rows
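The depth results above are reported as Abs Rel, RMSE, and (in the abstract) $\delta_1$. This page does not define them, so the sketch below uses the definitions commonly applied in monocular depth evaluation: mean absolute relative error, root mean squared error, and the fraction of pixels whose ratio max(pred/gt, gt/pred) is below 1.25. The function name `depth_metrics` is hypothetical, inputs are assumed to be flat lists of strictly positive depths, and steps that benchmark protocols usually add (masking invalid pixels, cropping, capping depth) are omitted.

```python
import math


def depth_metrics(pred, gt):
    """Compute Abs Rel, RMSE and delta_1 for paired positive depth values."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")
    n = len(gt)
    pairs = list(zip(pred, gt))

    # Abs Rel: mean of |pred - gt| / gt (lower is better).
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n

    # RMSE: square root of the mean squared error (lower is better).
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)

    # delta_1: share of pixels with max(pred/gt, gt/pred) < 1.25 (higher is better).
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n

    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


# Made-up depths in meters, only to exercise the function.
example = depth_metrics(pred=[10.0, 22.0, 30.0, 9.0], gt=[10.0, 20.0, 20.0, 10.0])
```

Note the direction of each metric when reading the table: Abs Rel and RMSE are errors, so smaller values are better, while $\delta_1$ is an accuracy, so the improvement from 0.885 to 0.925 quoted in the abstract is an increase.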
