Hierarchical Deep Stereo Matching on High-resolution Images

About

We explore the problem of real-time stereo matching on high-res imagery. Many state-of-the-art (SOTA) methods struggle to process high-res imagery because of memory constraints or speed limitations. To address this issue, we propose an end-to-end framework that searches for correspondences incrementally over a coarse-to-fine hierarchy. Because high-res stereo datasets are relatively rare, we introduce a dataset with high-res stereo pairs for both training and evaluation. Our approach achieved SOTA performance on Middlebury-v3 and KITTI-15 while running significantly faster than its competitors. The hierarchical design also naturally allows for anytime on-demand reports of disparity by capping intermediate coarse results, allowing us to accurately predict disparity for near-range structures with low latency (30ms). We demonstrate that the performance-vs-speed trade-off afforded by on-demand hierarchies may address sensing needs for time-critical applications such as autonomous driving.
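The coarse-to-fine idea in the abstract can be illustrated with a small, classical block-matching sketch. This is not the authors' learned network. It is a toy on 1D scanlines, assuming a sum-of-absolute-differences cost, a pyramid that halves the width at each level, and a refinement of +/- 1 pixel around the upsampled coarse estimate. The function names (`downsample`, `window_cost`, `coarse_to_fine`, `demo`) and all parameters are hypothetical. The generator yields a result at every level, which mirrors the "anytime" behaviour: a caller can stop after the first, coarsest estimate.

```python
import random


def downsample(row):
    """Halve a scanline by averaging neighbouring pixel pairs."""
    return [(row[i] + row[i + 1]) / 2.0 for i in range(0, len(row) - 1, 2)]


def window_cost(left, right, x, d, radius):
    """Sum of absolute differences between the left window centred at x
    and the right window centred at x - d (indices clamped at the borders)."""
    last = len(left) - 1
    cost = 0.0
    for k in range(-radius, radius + 1):
        xl = min(max(x + k, 0), last)
        xr = min(max(x + k - d, 0), last)
        cost += abs(left[xl] - right[xr])
    return cost


def coarse_to_fine(left, right, max_disp, levels=3, radius=2):
    """Yield (scale, disparities) from the coarsest level to full resolution.

    Disparities are in pixels of the level they were computed at, so
    multiplying by `scale` gives an approximate full-resolution value.
    """
    pyramid = [(left, right)]
    for _ in range(levels - 1):
        l, r = pyramid[-1]
        pyramid.append((downsample(l), downsample(r)))

    disp = None
    for level in range(levels - 1, -1, -1):
        l, r = pyramid[level]
        scale = 2 ** level
        new_disp = []
        for x in range(len(l)):
            if disp is None:
                # Coarsest level: search the full (shrunken) disparity range.
                candidates = range(0, max_disp // scale + 1)
            else:
                # Finer level: refine the upsampled coarse guess by +/- 1 pixel.
                prior = 2 * disp[min(x // 2, len(disp) - 1)]
                candidates = [d for d in (prior - 1, prior, prior + 1) if d >= 0]
            new_disp.append(
                min(candidates, key=lambda d: window_cost(l, r, x, d, radius))
            )
        disp = new_disp
        yield scale, disp


def demo(true_disp=8, width=64, seed=0):
    """Hypothetical synthetic pair: the right scanline is the left one
    shifted by true_disp pixels, so the expected disparity is constant."""
    rng = random.Random(seed)
    base = [rng.random() for _ in range(width + true_disp)]
    left, right = base[:width], base[true_disp:]
    return list(coarse_to_fine(left, right, max_disp=16))
```

Calling `next(coarse_to_fine(left, right, max_disp=16))` returns only the coarsest estimate, which costs the least to compute. Consuming the whole generator trades latency for resolution, the same trade-off the abstract describes for on-demand hierarchies.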

Gengshan Yang, Joshua Manela, Michael Happold, Deva Ramanan • 2019

Related benchmarks

Task	Dataset	Metric	Result
Stereo Matching	KITTI 2015 (test)	D1 Error (Overall)	2.14	245
Stereo Matching	KITTI 2015	D1 Error (All)	3.74	142
Stereo Matching	KITTI 2012 (test)	Outlier Rate (3px, Noc)	1.53	105
Stereo Matching	ETH3D	bad 1.0	4.4	95
Stereo Matching	Middlebury	Bad Pixel Rate (Thresh 2.0)	16.5	84
Disparity Estimation	KITTI 2015 (test)	D1 Error (bg, all)	1.8	77
Stereo Matching	Middlebury v3	Bad Pixel Rate (Thresh 2.0)	10.2	35
Depth Estimation	Gated Stereo Day 1.0 (test)	RMSE	10.36	19
Depth Estimation	Gated Stereo Night 1.0 (test)	RMSE	12.42	19
Stereo Depth Estimation	SQUID zero-shot	Relative Error (Rel)	0.9772	16

Showing 10 of 16 rows
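For readers unfamiliar with the metrics in the table, the sketch below shows how such error rates are commonly computed from an estimated and a ground-truth disparity list. It assumes the usual conventions: a "bad pixel" has absolute error above a threshold, and a KITTI-style D1 outlier has error above both 3 px and 5% of the true disparity. The exact boundary handling (strict versus non-strict inequality) and the use of `None` to mark pixels without ground truth are assumptions of this sketch, and the function names are hypothetical.

```python
import math


def _valid_errors(est, gt):
    """Pair each absolute error with its ground truth, skipping pixels
    whose ground truth is None (assumed marker for 'no measurement')."""
    return [(abs(e - g), g) for e, g in zip(est, gt) if g is not None]


def bad_pixel_rate(est, gt, thresh):
    """Percentage of valid pixels whose absolute error exceeds thresh."""
    pairs = _valid_errors(est, gt)
    return 100.0 * sum(err > thresh for err, _ in pairs) / len(pairs)


def d1_error(est, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose error exceeds both the absolute
    threshold and the given fraction of the true disparity."""
    pairs = _valid_errors(est, gt)
    outliers = sum(err > abs_thresh and err > rel_thresh * g for err, g in pairs)
    return 100.0 * outliers / len(pairs)


def rmse(est, gt):
    """Root-mean-square error over valid pixels."""
    pairs = _valid_errors(est, gt)
    return math.sqrt(sum(err * err for err, _ in pairs) / len(pairs))
```

A larger true disparity tolerates a larger absolute error under the D1 rule, which is why the same 4 px error can count as an outlier for a far surface and as correct for a very near one.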
