ELFNet: Evidential Local-global Fusion for Stereo Matching

About

Although existing stereo matching models have improved steadily, they often face trustworthiness issues because they lack uncertainty estimation. Additionally, effectively leveraging multi-scale and multi-view knowledge of stereo pairs remains unexplored. In this paper, we introduce the Evidential Local-global Fusion (ELF) framework for stereo matching, which provides both uncertainty estimation and confidence-aware fusion through trustworthy heads. Instead of predicting the disparity map alone, our model estimates an evidential disparity that accounts for both aleatoric and epistemic uncertainties. With the normal inverse-Gamma distribution as a bridge, the proposed framework realizes intra evidential fusion of multi-level predictions and inter evidential fusion between cost-volume-based and transformer-based stereo matching. Extensive experimental results show that the proposed framework exploits multi-view information effectively and achieves state-of-the-art overall performance in both accuracy and cross-domain generalization. The code is available at https://github.com/jimmy19991222/ELFNet.

Jieming Lou, Weide Liu, Zhuo Chen, Fayao Liu, Jun Cheng • 2023
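
The abstract describes the normal inverse-Gamma (NIG) distribution as the bridge between uncertainty estimation and fusion. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the parameterization commonly used in deep evidential regression, where aleatoric uncertainty is beta / (alpha - 1) and epistemic uncertainty is beta / (nu * (alpha - 1)), and it assumes the summation rule from the mixture-of-NIG literature for combining two predictions. The names `NIG`, `mu`, `nu`, `aleatoric`, `epistemic`, and `fuse` are hypothetical and chosen for illustration; the repository linked above may use different symbols and formulas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NIG:
    """Normal inverse-Gamma parameters for one pixel (illustrative names)."""
    mu: float     # predicted disparity (mean)
    nu: float     # virtual evidence for the mean, must be > 0
    alpha: float  # shape, must be > 1 for the uncertainties below
    beta: float   # scale, must be > 0


def aleatoric(p: NIG) -> float:
    # Expected data noise under the assumed parameterization.
    return p.beta / (p.alpha - 1.0)


def epistemic(p: NIG) -> float:
    # Variance of the predicted mean; shrinks as evidence nu grows.
    return p.beta / (p.nu * (p.alpha - 1.0))


def fuse(a: NIG, b: NIG) -> NIG:
    # Assumed mixture-of-NIG summation: the fused mean is weighted by
    # evidence, and disagreement between the inputs inflates beta.
    nu = a.nu + b.nu
    mu = (a.nu * a.mu + b.nu * b.mu) / nu
    alpha = a.alpha + b.alpha + 0.5
    beta = (a.beta + b.beta
            + 0.5 * a.nu * (a.mu - mu) ** 2
            + 0.5 * b.nu * (b.mu - mu) ** 2)
    return NIG(mu=mu, nu=nu, alpha=alpha, beta=beta)


# Hypothetical per-pixel outputs of a local branch and a global branch.
local_pred = NIG(mu=10.0, nu=1.0, alpha=2.0, beta=1.0)
global_pred = NIG(mu=12.0, nu=3.0, alpha=3.0, beta=2.0)
fused = fuse(local_pred, global_pred)
```

In this sketch the fused disparity is pulled toward the branch with more evidence (`nu`), which is one way to read "confidence-aware fusion": the more confident prediction contributes more to the result, and the fused epistemic uncertainty falls below that of either input.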

Related benchmarks

Task	Dataset	Metric	Result
Stereo Matching	KITTI 2015	D1 Error (All)	9.61	142
Stereo Matching	KITTI 2012	Error Rate (3px, All)	10.52	108
Stereo Matching	ETH3D	Threshold Error > 1px (Noc)	24.5	50
Stereo Matching	Booster Q (test)	Error Rate (> 2%)	45.52	26
Stereo Matching	Middlebury 2021	Bad Pixel Rate (Thresh > 2.0, All)	27.08	24
Stereo Matching	LayeredFlow E (test)	Error Rate (> 1%)	93.08	13
Stereo Matching	Middlebury half-resolution 2014 v3 (test)	Bad Error Rate (All)	24.48	11
