Attention Concatenation Volume for Accurate and Efficient Stereo Matching

About

Stereo matching is a fundamental building block for many vision and robotics applications. An informative and concise cost volume representation is vital for accurate and efficient stereo matching. In this paper, we present a novel cost volume construction method that generates attention weights from correlation clues to suppress redundant information and enhance matching-related information in the concatenation volume. To generate reliable attention weights, we propose multi-level adaptive patch matching, which improves the distinctiveness of the matching cost at different disparities, even in textureless regions. The proposed cost volume, named attention concatenation volume (ACV), can be seamlessly embedded into most stereo matching networks. The resulting networks can use a more lightweight aggregation network while achieving higher accuracy; for example, GwcNet achieves higher accuracy with only 1/25 of the parameters of its aggregation network. Furthermore, we design a highly accurate network (ACVNet) based on our ACV, which achieves state-of-the-art performance on several benchmarks.
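The core idea of the abstract, weighting a concatenation volume with attention derived from correlation, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names are hypothetical, a plain channel-wise correlation stands in for the multi-level adaptive patch matching, and a softmax over the disparity axis stands in for the learned attention weights.

```python
import numpy as np


def concat_volume(left, right, max_disp):
    """Concatenation volume of shape (2C, D, H, W) from (C, H, W) features."""
    c, h, w = left.shape
    vol = np.zeros((2 * c, max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        # Left pixel x is paired with right pixel x - d; columns x < d stay zero.
        vol[:c, d, :, d:] = left[:, :, d:]
        vol[c:, d, :, d:] = right[:, :, : w - d]
    return vol


def correlation_volume(left, right, max_disp):
    """Correlation volume of shape (D, H, W): channel-wise mean of products."""
    c, h, w = left.shape
    vol = np.zeros((max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        vol[d, :, d:] = (left[:, :, d:] * right[:, :, : w - d]).mean(axis=0)
    return vol


def attention_weights(corr):
    """Assumption: softmax over disparity replaces the learned attention."""
    e = np.exp(corr - corr.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)


def attention_concat_volume(left, right, max_disp):
    """Broadcast the (D, H, W) weights over the channel axis of the volume."""
    attn = attention_weights(correlation_volume(left, right, max_disp))
    return attn[None] * concat_volume(left, right, max_disp)
```

In this sketch the weights only rescale each disparity slice, so the filtered volume keeps the shape of the concatenation volume and could be passed to the same aggregation network.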

Gangwei Xu, Junda Cheng, Peng Guo, Xin Yang • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Stereo Matching | KITTI 2015 (test) | D1 Error (Overall) | 0.0234 | 144 |
| Stereo Matching | KITTI 2015 | D1 Error (All) | 1.65 | 118 |
| Stereo Matching | KITTI 2012 | Error Rate (3px, Noc) | 1.13 | 81 |
| Stereo Matching | KITTI 2012 (test) | Outlier Rate (3px, Noc) | 1.13 | 76 |
| Stereo Matching | Scene Flow (test) | EPE | 0.48 | 70 |
| Stereo Matching | Scene Flow | EPE (px) | 0.48 | 40 |
| Stereo Matching | KITTI 2015 (all pixels) | D1 Error (Background) | 1.37 | 38 |
| Stereo Matching | KITTI Noc 2015 | D1 Error (Background) | 1.26 | 32 |
| Stereo Matching | KITTI 2012 (Noc) | Error Rate (>2px) | 1.83 | 26 |
| Stereo Matching | KITTI 2012 (All split) | Error Rate (>2px) | 2.34 | 26 |
Showing 10 of 19 rows
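The page does not define the metrics in the table. As a rough guide, the sketch below follows commonly used conventions: EPE as the mean absolute disparity error in pixels, D1 as the percentage of pixels whose error exceeds both 3 px and 5% of the ground-truth disparity, and the ">N px" error rate as the percentage of pixels whose error exceeds N pixels. The function names and the `valid` mask parameter are hypothetical, and benchmark toolkits may treat threshold boundaries and masks differently.

```python
import numpy as np


def _valid_mask(gt, valid):
    # Assumption: without a mask, every pixel has ground truth.
    return np.ones(gt.shape, dtype=bool) if valid is None else valid


def epe(pred, gt, valid=None):
    """End-point error: mean absolute disparity error in pixels."""
    err = np.abs(pred - gt)
    return float(err[_valid_mask(gt, valid)].mean())


def d1_error(pred, gt, valid=None):
    """Percentage of pixels with error > 3 px and > 5% of the true disparity."""
    err = np.abs(pred - gt)
    outlier = (err > 3.0) & (err > 0.05 * np.abs(gt))
    return float(100.0 * outlier[_valid_mask(gt, valid)].mean())


def px_error_rate(pred, gt, thresh=3.0, valid=None):
    """Percentage of pixels with absolute error above `thresh` pixels."""
    err = np.abs(pred - gt)
    return float(100.0 * (err > thresh)[_valid_mask(gt, valid)].mean())


# Toy example with four pixels (values are illustrative, not benchmark data).
gt = np.array([10.0, 10.0, 100.0, 100.0])
pred = np.array([10.5, 14.0, 104.0, 110.0])
print(epe(pred, gt), d1_error(pred, gt), px_error_rate(pred, gt, 3.0))
```

The third pixel shows why D1 and the plain 3 px error rate can differ: its 4 px error exceeds 3 px but is below 5% of a disparity of 100, so it counts as an outlier for the 3 px rate only.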
