IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching

About

Stereo matching is a core component in many computer vision and robotics systems. Despite significant advances over the last decade, handling matching ambiguities in ill-posed regions and large disparities remains an open challenge. In this paper, we propose a new deep network architecture, called IGEV++, for stereo matching. IGEV++ constructs Multi-range Geometry Encoding Volumes (MGEV), which encode coarse-grained geometry information for ill-posed regions and large disparities while preserving fine-grained geometry information for details and small disparities. To construct MGEV, we introduce an adaptive patch matching module that efficiently computes matching costs for large disparity ranges and/or ill-posed regions. We further propose a selective geometry feature fusion module that adaptively fuses the multi-range and multi-granularity geometry features in MGEV. The fused geometry features are then fed into ConvGRUs, which iteratively update the disparity map. MGEV allows the network to handle large disparities and ill-posed regions, such as occlusions and textureless regions, efficiently, and it converges rapidly over the iterations. IGEV++ achieves the best performance on the Scene Flow test set across all disparity ranges, up to 768 px. It also achieves state-of-the-art accuracy on the Middlebury, ETH3D, KITTI 2012, and KITTI 2015 benchmarks. Specifically, IGEV++ achieves a 3.23% 2-pixel outlier rate (Bad 2.0) on Middlebury, the large-disparity benchmark, representing error reductions of 31.9% and 54.8% compared to RAFT-Stereo and GMStereo, respectively. We also present a real-time version of IGEV++ that achieves the best performance among all published real-time methods on the KITTI benchmarks. The code is publicly available at https://github.com/gangweix/IGEV and https://github.com/gangweix/IGEV-plusplus.
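
The Bad 2.0 figure and the relative error reductions quoted above follow standard stereo-evaluation arithmetic. As a minimal sketch in plain Python (not the authors' evaluation code), the functions below compute the end-point error (EPE), a Bad-τ outlier rate, and the baseline error implied by a stated relative reduction. The function names and the use of `None` to mark pixels without ground truth are assumptions made for illustration; the benchmark servers apply their own masks and protocols.

```python
def _valid_errors(pred, gt):
    """Absolute disparity errors for pixels that have ground truth.

    Assumption: a ground-truth value of None marks an invalid pixel.
    """
    return [abs(p - g) for p, g in zip(pred, gt) if g is not None]


def end_point_error(pred, gt):
    """Mean absolute disparity error (EPE), in pixels, over valid pixels."""
    errors = _valid_errors(pred, gt)
    return sum(errors) / len(errors)


def bad_rate(pred, gt, threshold=2.0):
    """Percentage of valid pixels whose error exceeds `threshold` pixels.

    threshold=2.0 corresponds to the Bad 2.0 metric named in the abstract.
    """
    errors = _valid_errors(pred, gt)
    outliers = sum(1 for e in errors if e > threshold)
    return 100.0 * outliers / len(errors)


def implied_baseline(ours, relative_reduction):
    """Baseline error implied by `ours` and a relative error reduction.

    relative_reduction is a fraction, e.g. 0.319 for a 31.9% reduction,
    defined as (baseline - ours) / baseline.
    """
    return ours / (1.0 - relative_reduction)


# Toy flattened disparity maps (hypothetical values); last pixel is invalid.
pred = [10.0, 12.5, 30.0, 7.0]
gt = [10.5, 12.0, 33.0, None]

epe = end_point_error(pred, gt)      # errors are 0.5, 0.5, 3.0
bad2 = bad_rate(pred, gt, 2.0)       # one of three valid pixels exceeds 2 px
```

Applied to the abstract's numbers, `implied_baseline(3.23, 0.319)` gives roughly 4.74 and `implied_baseline(3.23, 0.548)` gives roughly 7.15, which are the Bad 2.0 rates the stated reductions imply for the two compared methods.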

Gangwei Xu, Xianqi Wang, Zhaoxing Zhang, Junda Cheng, Chunyuan Liao, Xin Yang • 2024

Related benchmarks

Task             Dataset                  Metric                       Result  Rank
Stereo Matching  KITTI 2015               D1 Error (All)               1.79    118
Stereo Matching  KITTI 2012               Error Rate (3px, Noc)        1.29    81
Stereo Matching  KITTI 2012 (test)        -                            -       76
Stereo Matching  Scene Flow (test)        EPE                          0.5     70
Stereo Matching  ETH3D                    bad 1.0                      1.58    51
Stereo Matching  Scene Flow               EPE (px)                     0.52    40
Stereo Matching  KITTI 2015 (all pixels)  D1 Error (Background)        1.31    38
Stereo Matching  Middlebury               Bad Pixel Rate (Thresh 2.0)  7.19    34
Stereo Matching  ETH3D                    Threshold Error > 1px (All)  4.45    30
Stereo Matching  KITTI 2012 (Noc)         Error Rate (>2px)            1.56    26
Showing 10 of 19 rows
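
The D1 entries above refer to the KITTI 2015 outlier metric, under which a pixel counts as an outlier when its disparity error exceeds both 3 px and 5% of the ground-truth disparity. A minimal sketch of that rule in plain Python follows. The function name, the `None` convention for missing ground truth, and the strict inequalities are assumptions for illustration, not the official KITTI development kit.

```python
def d1_error(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels that are D1 outliers.

    A pixel is an outlier when its absolute disparity error is greater
    than `abs_thresh` pixels AND greater than `rel_thresh` times the
    ground-truth disparity. Assumption: gt of None (or <= 0) is invalid.
    """
    valid = [(p, g) for p, g in zip(pred, gt) if g is not None and g > 0]
    outliers = 0
    for p, g in valid:
        err = abs(p - g)
        if err > abs_thresh and err > rel_thresh * g:
            outliers += 1
    return 100.0 * outliers / len(valid)


# Toy values (hypothetical): both of the first two pixels are off by 4 px,
# but only the first also exceeds 5% of its ground-truth disparity.
pred = [50.0, 100.0, 20.0, 80.0]
gt = [54.0, 104.0, 20.5, 80.0]
d1 = d1_error(pred, gt)
```

The relative term is what distinguishes D1 from a fixed-threshold Bad-τ rate: a 4 px error on a 104 px disparity is tolerated, while the same error on a 54 px disparity is not.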
