
MLG-Stereo: ViT Based Stereo Matching with Multi-Stage Local-Global Enhancement

About

With the development of deep learning, ViT-based stereo matching methods have made significant progress thanks to their robustness and zero-shot ability. However, because ViTs are sensitive to input resolution and relatively neglect local information, ViT-based methods remain weaker than CNN-based methods at predicting fine details and handling arbitrary-resolution images. To address these shortcomings, we propose MLG-Stereo, a systematic pipeline-level design that extends global modeling beyond the encoder stage. First, we propose a Multi-Granularity Feature Network that balances global context and local geometric information, enabling comprehensive feature extraction from images of arbitrary resolution and bridging the gap between training and inference scales. Then, a Local-Global Cost Volume is constructed to capture both locally correlated and globally aware matching information. Finally, a Local-Global Guided Recurrent Unit is introduced to iteratively refine the disparity locally under the guidance of global information. Extensive experiments on multiple benchmark datasets demonstrate that MLG-Stereo achieves highly competitive performance on the Middlebury and KITTI-2015 benchmarks compared to contemporaneous leading methods, and outstanding results on the KITTI-2012 dataset.
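
The abstract does not specify how the Local-Global Cost Volume is built. As background only, the sketch below shows a generic correlation cost volume of the kind commonly used in stereo matching: for each candidate disparity d, left-image features at column x are compared with right-image features at column x - d. The function name, the (C, H, W) feature layout, and the channel-mean normalization are assumptions made for illustration; this is not the MLG-Stereo construction.

```python
import numpy as np

def correlation_cost_volume(feat_left, feat_right, max_disp):
    """Generic correlation cost volume (illustrative; NOT the paper's design).

    feat_left, feat_right: arrays of shape (C, H, W) from a rectified pair.
    Returns an array of shape (max_disp, H, W) where entry [d, y, x] is the
    channel-mean of feat_left[:, y, x] * feat_right[:, y, x - d].
    Positions with x < d have no match candidate and are left at zero.
    """
    _, height, width = feat_left.shape
    volume = np.zeros((max_disp, height, width), dtype=feat_left.dtype)
    for d in range(max_disp):
        # Left columns d..W-1 are paired with right columns 0..W-1-d.
        product = feat_left[:, :, d:] * feat_right[:, :, : width - d]
        volume[d, :, d:] = product.mean(axis=0)
    return volume

# Toy usage: every column of the right image has a unique one-hot feature,
# and the left image is the right image shifted by 2 pixels.
feat_right = np.repeat(np.eye(8)[:, None, :], 3, axis=1)  # shape (8, 3, 8)
feat_left = np.roll(feat_right, 2, axis=2)
volume = correlation_cost_volume(feat_left, feat_right, max_disp=4)
best_disparity = volume.argmax(axis=0)  # per-pixel winner-takes-all estimate
```

In this toy case the highest correlation for every column x >= 2 falls at disparity 2, which is the shift that was applied. Recurrent methods typically look up values around the current estimate in such a volume at each iteration rather than taking a single argmax.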

Haoyu Zhang, Jingyi Zhou, Peng Ye, Jiakang Yuan, Lin Zhang, Feng Xu, Tao Chen • 2026

Related benchmarks

Task             Dataset                            Metric                   Result  Rank
Stereo Matching  KITTI 2015 (test)                  D1 Error (Overall)       1.36    233
Stereo Matching  KITTI 2012 (test)                  Outlier Rate (3px, Noc)  0.86    105
Stereo Matching  Middlebury (test)                  EPE                      0.74    60
Stereo Matching  Middlebury full resolution         2px Error Rate           9.71    21
Stereo Matching  Middlebury Half resolution (H)     EPE                      0.76    11
Stereo Matching  Middlebury Quarter resolution (Q)  EPE (Q)                  0.53    11
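
The metrics named in the benchmark list above follow standard stereo definitions, sketched here under stated assumptions: EPE is the mean absolute disparity error over valid pixels, an N-px error rate is the percentage of valid pixels whose error exceeds N pixels, and the KITTI 2015 D1 error counts a pixel as an outlier when its error exceeds both 3 px and 5% of the ground-truth disparity. The function names and the boolean `valid` mask are hypothetical, and the official benchmark toolkits may differ in details such as masking and boundary handling.

```python
import numpy as np

def epe(pred, gt, valid):
    """End-point error: mean absolute disparity error over valid pixels."""
    return float(np.abs(pred - gt)[valid].mean())

def bad_pixel_rate(pred, gt, valid, threshold):
    """Percentage of valid pixels whose absolute error exceeds `threshold` px."""
    err = np.abs(pred - gt)[valid]
    return float((err > threshold).mean() * 100.0)

def d1_error(pred, gt, valid):
    """KITTI 2015 style D1: error > 3 px AND error > 5% of true disparity."""
    err = np.abs(pred - gt)[valid]
    outlier = (err > 3.0) & (err > 0.05 * gt[valid])
    return float(outlier.mean() * 100.0)

# Toy 2x2 disparity maps; every pixel is treated as valid.
gt = np.array([[10.0, 100.0], [20.0, 40.0]])
pred = np.array([[10.5, 104.0], [25.0, 40.0]])
valid = np.ones_like(gt, dtype=bool)

toy_epe = epe(pred, gt, valid)                  # errors are 0.5, 4, 5, 0
toy_bad2 = bad_pixel_rate(pred, gt, valid, 2.0)
toy_d1 = d1_error(pred, gt, valid)
```

In the toy data, the pixel with error 4 px against a ground truth of 100 counts toward the 2 px error rate but not toward D1, because 4 px is below 5% of 100. This relative threshold is what separates D1 from a fixed-threshold outlier rate.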
