Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

About

We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with examples of similar and dissimilar pairs of patches. We examine two network architectures for this task: one tuned for speed, the other for accuracy. The output of the convolutional neural network is used to initialize the stereo matching cost. A series of post-processing steps follow: cross-based cost aggregation, semiglobal matching, a left-right consistency check, subpixel enhancement, a median filter, and a bilateral filter. We evaluate our method on the KITTI 2012, KITTI 2015, and Middlebury stereo data sets and show that it outperforms other approaches on all three data sets.
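One of the post-processing steps named above, the left-right consistency check, can be sketched on a single scanline. This is a minimal illustration, not the authors' implementation: the function name `left_right_check`, the tolerance `max_diff=1.0`, and the choice to mark out-of-image matches as invalid are all assumptions made for the example.

```python
def left_right_check(disp_left, disp_right, max_diff=1.0):
    """Flag left-image disparities that agree with the right-image disparity map.

    disp_left and disp_right are lists of disparities for one rectified scanline.
    A pixel x in the left image with disparity d is assumed to match column
    x - d in the right image. The tolerance max_diff is an assumed value.
    """
    width = len(disp_left)
    valid = []
    for x, d in enumerate(disp_left):
        xr = x - int(round(d))  # corresponding column in the right image
        if 0 <= xr < width:
            valid.append(abs(d - disp_right[xr]) <= max_diff)
        else:
            # The match falls outside the image, so it cannot be verified.
            valid.append(False)
    return valid


# Hypothetical disparities for a five-pixel scanline.
left = [0, 1, 1, 2, 5]
right = [1, 4, 2, 2, 0]
print(left_right_check(left, right))
```

Pixels flagged as inconsistent would typically be treated as occlusions or mismatches and filled in from neighbouring valid disparities; that interpolation is left out of this sketch.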

Jure Žbontar, Yann LeCun • 2015

Related benchmarks

Task                  Dataset                 Metric                    Result   Rank
Stereo Matching       KITTI 2015 (test)       D1 Error (Overall)        3.88     144
Stereo Matching       KITTI 2015              D1 Error (All)            3.89     118
Stereo Matching       KITTI 2012              Error Rate (3px, Noc)     0.0209   81
Disparity Estimation  KITTI 2015 (test)       D1 Error (bg, all)        2.48     77
Stereo Matching       KITTI 2012 (test)       Outlier Rate (3px, Noc)   2.09     76
Stereo Matching       KITTI Noc 2015          D1 Error (Background)     2.48     32
Stereo Matching       KITTI 2012 (Noc)        Error Rate (>2px)         3.9      26
Stereo Matching       KITTI 2012 (All split)  Error Rate (>2px)         5.45     26
Disparity Estimation  Scene Flow (test)       EPE                       3.79     24
Stereo Matching       Middlebury v3           Average Error             3.82     17
Showing 10 of 13 rows
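
The page does not define the metrics in the table. The sketch below shows how such quantities are commonly computed, assuming the usual KITTI conventions: a 3px outlier is a pixel whose absolute disparity error exceeds 3 pixels, a D1 outlier must additionally exceed 5% of the ground-truth disparity, and EPE is the mean absolute disparity error. The function names are hypothetical, and masking of pixels without ground truth is omitted.

```python
def outlier_rate(pred, gt, threshold=3.0):
    """Percentage of pixels whose absolute disparity error exceeds threshold."""
    bad = sum(1 for p, g in zip(pred, gt) if abs(p - g) > threshold)
    return 100.0 * bad / len(gt)


def d1_error(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of pixels with error > abs_thresh px AND > rel_thresh * ground truth."""
    bad = sum(
        1
        for p, g in zip(pred, gt)
        if abs(p - g) > abs_thresh and abs(p - g) > rel_thresh * g
    )
    return 100.0 * bad / len(gt)


def end_point_error(pred, gt):
    """Mean absolute disparity error in pixels."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(gt)


# Hypothetical predicted and ground-truth disparities for four pixels.
gt = [10.0, 20.0, 100.0, 50.0]
pred = [10.5, 24.0, 104.0, 50.0]

print(outlier_rate(pred, gt))     # 3px outlier rate, in percent
print(d1_error(pred, gt))         # D1 error, in percent
print(end_point_error(pred, gt))  # EPE, in pixels
```

In this toy example the third pixel counts as a 3px outlier but not as a D1 outlier, because its 4-pixel error is below 5% of the ground-truth disparity of 100.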
