Adaptive Unimodal Cost Volume Filtering for Deep Stereo Matching

About

State-of-the-art deep-learning-based stereo matching approaches treat disparity estimation as a regression problem, where the loss function is defined directly on the true disparities and their estimates. However, disparity is only a byproduct of a matching process modeled by the cost volume, and learning the cost volume indirectly through disparity regression is prone to overfitting because the cost volume is under-constrained. In this paper, we propose to constrain the cost volume directly by filtering it with a unimodal distribution peaked at the true disparity. In addition, the variance of the unimodal distribution is estimated for each pixel to explicitly model matching uncertainty under different contexts. The proposed architecture achieves state-of-the-art performance on Scene Flow and the two KITTI stereo benchmarks. In particular, our method ranked 1st in the KITTI 2012 evaluation and 4th in the KITTI 2015 evaluation (recorded on 2019.8.20). The code for AcfNet is available at: https://github.com/DeepMotionAIResearch/DenseMatchingBenchmark.

Youmin Zhang, Yimin Chen, Xiao Bai, Suihanjin Yu, Kun Yu, Zhiwei Li, Kuiyuan Yang • 2019
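
The cost volume filtering described in the abstract can be illustrated for a single pixel. The following is a minimal sketch, not the AcfNet implementation: it assumes the unimodal target takes the form softmax(-|d - d_true| / sigma), that matching costs are turned into probabilities with softmax(-cost), and that the two are compared with plain cross entropy. The function names, the example costs, and the fixed `sigma` values (standing in for the per-pixel variance the abstract says is estimated) are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def unimodal_target(d_true, sigma, max_disp):
    """Target distribution over disparities 0..max_disp-1, peaked at d_true.

    Assumed form: softmax(-|d - d_true| / sigma). A smaller sigma gives a
    sharper peak; a larger sigma expresses more matching uncertainty.
    """
    return softmax([-abs(d - d_true) / sigma for d in range(max_disp)])

def predicted_distribution(costs):
    """Turn matching costs into probabilities: lower cost, higher probability."""
    return softmax([-c for c in costs])

def cross_entropy(target, pred, eps=1e-12):
    """Cross entropy between the target and predicted distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))

def expected_disparity(pred):
    """Regression-style estimate: expectation of disparity under pred."""
    return sum(d * p for d, p in enumerate(pred))

# One pixel with 8 candidate disparities and a true disparity of 3.
target_sharp = unimodal_target(d_true=3.0, sigma=0.5, max_disp=8)
target_flat = unimodal_target(d_true=3.0, sigma=2.0, max_disp=8)

# Hypothetical cost slices: one with a single minimum, one with several.
costs_good = [5.0, 4.0, 2.0, 0.5, 2.0, 4.0, 5.0, 6.0]
costs_multimodal = [5.0, 0.6, 4.0, 0.5, 4.0, 5.0, 0.6, 6.0]

loss_good = cross_entropy(target_sharp, predicted_distribution(costs_good))
loss_multimodal = cross_entropy(target_sharp, predicted_distribution(costs_multimodal))
```

In this sketch, the multimodal cost slice receives a larger loss than the single-minimum slice, which is the kind of direct constraint on the cost volume that the abstract contrasts with supervising only the regressed disparity.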

Related benchmarks

Task	Dataset	Metric	Result	
Stereo Matching	KITTI 2015 (test)	D1 Error (Overall)	1.89	233
Stereo Matching	KITTI 2012	Error Rate (3px, All)	1.54	108
Stereo Matching	KITTI 2012 (test)	Outlier Rate (3px, Noc)	1.17	105
Stereo Matching	KITTI 2015 (all pixels)	D1 Error (Background)	1.51	48
Stereo Matching	KITTI Noc 2015	D1 Error (Background)	1.36	42
Disparity Estimation	Scene Flow (test)	EPE	0.87	24
Stereo Disparity	KITTI 2015	3PE (Non Occlusion Foreground)	3.49	12
Stereo Matching	Middlebury half resolution (train)	Cosine Similarity	61	12
Stereo Disparity	Scene Flow	EPE	0.87	11
Stereo Matching	KITTI 2012 (train)	D2 Error (Noc)	1.83	11
