
StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction

About

This paper presents StereoNet, the first end-to-end deep architecture for real-time stereo matching that runs at 60 fps on an NVIDIA Titan X, producing high-quality, edge-preserved, quantization-free disparity maps. A key insight of this paper is that the network achieves a sub-pixel matching precision that is an order of magnitude higher than that of traditional stereo matching approaches. This allows us to achieve real-time performance by using a very low resolution cost volume that encodes all the information needed to achieve high disparity precision. Spatial precision is achieved by employing a learned edge-aware upsampling function. Our model uses a Siamese network to extract features from the left and right images. A first estimate of the disparity is computed in a very low resolution cost volume; the model then hierarchically reintroduces high-frequency details through a learned upsampling function that uses compact pixel-to-pixel refinement networks. Leveraging the color input as a guide, this function is capable of producing high-quality edge-aware output. We achieve compelling results on multiple benchmarks, showing how the proposed method offers extreme flexibility at an acceptable computational budget.
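To make the low-resolution cost volume idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the features have already been extracted by some shared network, builds the cost volume from mean absolute feature differences, and reads out a fractional disparity with a softmax-weighted average (a soft arg-min, a common differentiable choice that this page does not confirm for StereoNet). The function names `cost_volume` and `soft_argmin` and the `temperature` parameter are hypothetical.

```python
import numpy as np

def cost_volume(left_feat, right_feat, max_disp):
    """Build an (H, W, max_disp) cost volume from (H, W, C) feature maps.

    Assumption: cost is the mean absolute feature difference; a real
    network would typically filter this volume with learned layers.
    """
    h, w, _ = left_feat.shape
    cost = np.full((h, w, max_disp), np.inf)  # inf marks out-of-image matches
    for d in range(max_disp):
        # Left pixel at column x is compared with right pixel at column x - d.
        diff = np.abs(left_feat[:, d:, :] - right_feat[:, :w - d, :])
        cost[:, d:, d] = diff.mean(axis=-1)
    return cost

def soft_argmin(cost, temperature=1.0):
    """Sub-pixel disparity as the softmax-weighted mean of candidates."""
    logits = -cost / temperature
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)                      # exp(-inf) becomes 0
    weights /= weights.sum(axis=-1, keepdims=True)
    candidates = np.arange(cost.shape[-1])
    return (weights * candidates).sum(axis=-1)

# Usage: the left features are the right features shifted by 2 columns.
rng = np.random.default_rng(0)
right = rng.normal(size=(4, 16, 8))
left = np.roll(right, 2, axis=1)
disparity = soft_argmin(cost_volume(left, right, max_disp=4), temperature=0.01)
```

Because the readout is a weighted average rather than a hard arg-min, the estimate can fall between integer candidates. That matters at low resolution: if the volume were built at, say, 1/8 of the input size (an illustrative factor), an error of one low-resolution pixel would become eight pixels after upsampling.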

Sameh Khamis, Sean Fanello, Christoph Rhemann, Adarsh Kowdle, Julien Valentin, Shahram Izadi • 2018

Related benchmarks

Task                  Dataset                 Result                      Rank
Stereo Matching       KITTI 2015 (test)       D1 Error (Overall): 0.0483  144
Stereo Matching       KITTI 2015              D1 Error (All): 4.83        118
Stereo Matching       KITTI 2012              --                          81
Disparity Estimation  KITTI 2015 (test)       D1 Error (bg, all): 4.3     77
Stereo Matching       KITTI 2012 (test)       --                          76
Stereo Matching       Scene Flow (test)       EPE: 1.1                    70
Disparity Estimation  Scene Flow (test)       --                          24
Stereo Matching       Scene Flow (finalpass)  EPE (px): 1.1               22
Disparity Estimation  KITTI 2012 (test)       Mean Error (Noc): 0.8       9
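The benchmark entries report end-point error (EPE) and D1 error without defining them. The sketch below shows how these metrics are commonly computed, under stated assumptions: EPE is taken as the mean absolute disparity error in pixels, and D1 as the fraction of pixels whose error exceeds both 3 px and 5% of the ground-truth disparity, which is the KITTI 2015 convention as generally described. The function names and the `valid` mask argument (for example, to restrict evaluation to non-occluded pixels) are hypothetical.

```python
import numpy as np

def epe(pred, gt, valid=None):
    """Mean absolute disparity error in pixels over valid pixels."""
    if valid is None:
        valid = np.ones(gt.shape, dtype=bool)
    return float(np.abs(pred - gt)[valid].mean())

def d1_error(pred, gt, valid=None, abs_thresh=3.0, rel_thresh=0.05):
    """Fraction of valid pixels counted as outliers.

    Assumption: a pixel is an outlier when its error is larger than
    abs_thresh pixels AND larger than rel_thresh times the ground truth.
    """
    if valid is None:
        valid = np.ones(gt.shape, dtype=bool)
    err = np.abs(pred - gt)
    bad = (err > abs_thresh) & (err > rel_thresh * np.abs(gt))
    return float(bad[valid].mean())

# Usage on a toy example with four pixels.
gt = np.array([10.0, 10.0, 100.0, 100.0])
pred = np.array([10.5, 14.0, 104.0, 110.0])
print(epe(pred, gt), d1_error(pred, gt))
```

Under this definition D1 is a fraction in [0, 1]; multiplying by 100 gives a percentage, which may explain why the same score can appear as either 0.0483 or 4.83 in different rows.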
