AANet: Adaptive Aggregation Network for Efficient Stereo Matching

About

Despite the remarkable progress made by learning-based stereo matching algorithms, one key challenge remains unsolved. Current state-of-the-art stereo models are mostly based on costly 3D convolutions, whose cubic computational complexity and high memory consumption make them quite expensive to deploy in real-world applications. In this paper, we aim to completely replace the commonly used 3D convolutions to achieve fast inference speed while maintaining comparable accuracy. To this end, we first propose a sparse-points-based intra-scale cost aggregation method to alleviate the well-known edge-fattening issue at disparity discontinuities. Further, we approximate the traditional cross-scale cost aggregation algorithm with neural network layers to handle large textureless regions. Both modules are simple, lightweight, and complementary, leading to an effective and efficient architecture for cost aggregation. With these two modules, we can not only significantly speed up existing top-performing models (e.g., $41\times$ faster than GC-Net, $4\times$ faster than PSMNet, and $38\times$ faster than GA-Net), but also improve the performance of fast stereo models (e.g., StereoNet). We also achieve competitive results on the Scene Flow and KITTI datasets while running at 62 ms, demonstrating the versatility and high efficiency of the proposed method. Our full framework is available at https://github.com/haofeixu/aanet.

Haofei Xu, Juyong Zhang • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Stereo Matching | KITTI 2015 (test) | D1 Error (Overall) | 0.0255 | 144 |
| Stereo Matching | KITTI 2015 | D1 Error (All) | 2.03 | 118 |
| Stereo Matching | KITTI 2012 | Error Rate (3px, Noc) | 1.55 | 81 |
| Disparity Estimation | KITTI 2015 (test) | D1 Error (bg, all) | 1.8 | 77 |
| Stereo Matching | KITTI 2012 (test) | Outlier Rate (3px, Noc) | 1.55 | 76 |
| Stereo Matching | Scene Flow (test) | EPE | 0.72 | 70 |
| Stereo Matching | ETH3D | bad 1.0 | 5.41 | 51 |
| Stereo Matching | Scene Flow | EPE (px) | 0.87 | 40 |
| Stereo Matching | Middlebury | Bad Pixel Rate (Thresh 2.0) | 31.8 | 34 |
| Stereo Matching | ETH3D (test) | Error Rate (Th=1.0) | 5.01 | 30 |
Showing 10 of 19 rows
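
The metrics listed above follow common stereo benchmark conventions: EPE is the mean absolute disparity error in pixels, "bad τ" style metrics report the percentage of pixels whose error exceeds τ pixels, and the KITTI 2015 D1 error counts a pixel as an outlier when its error exceeds both 3 px and 5% of the ground-truth disparity. The sketch below is a minimal illustration of those definitions, not code from the AANet repository. The function names, the flat-list input, and the use of `None` for pixels without ground truth are all assumptions made for this example. Note that leaderboards differ in whether they report rates as percentages or fractions, so the values in the table may not share one unit.

```python
from typing import Optional, Sequence


def _valid_errors(pred: Sequence[float], gt: Sequence[Optional[float]]):
    """Return (abs_error, gt_value) pairs for pixels that have ground truth.

    Assumption for this sketch: missing ground truth is encoded as None.
    """
    pairs = [(abs(p - g), g) for p, g in zip(pred, gt) if g is not None]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return pairs


def end_point_error(pred, gt) -> float:
    """EPE: mean absolute disparity error in pixels over valid pixels."""
    pairs = _valid_errors(pred, gt)
    return sum(err for err, _ in pairs) / len(pairs)


def bad_pixel_rate(pred, gt, abs_thresh: float, rel_thresh: float = 0.0) -> float:
    """Percentage of valid pixels whose error exceeds abs_thresh pixels
    and also exceeds rel_thresh times the ground-truth disparity."""
    pairs = _valid_errors(pred, gt)
    bad = sum(
        1 for err, g in pairs
        if err > abs_thresh and err > rel_thresh * abs(g)
    )
    return 100.0 * bad / len(pairs)


def d1_error(pred, gt) -> float:
    """KITTI 2015 style D1: error > 3 px and > 5% of ground truth."""
    return bad_pixel_rate(pred, gt, abs_thresh=3.0, rel_thresh=0.05)


# Toy disparities (hypothetical values); the fourth pixel has no ground truth.
pred = [10.0, 21.5, 35.0, 50.0, 7.0]
gt = [10.0, 20.0, 30.0, None, 8.0]

print(end_point_error(pred, gt))      # 1.875
print(d1_error(pred, gt))             # 25.0
print(bad_pixel_rate(pred, gt, 1.0))  # 50.0 ("bad 1.0" style)
```

In the toy data the valid errors are 0, 1.5, 5, and 1 px, so only the 5 px error counts toward D1, while the 1.5 px and 5 px errors both count toward the 1 px threshold.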
