
An Improved RaftStereo Trained with A Mixed Dataset for the Robust Vision Challenge 2022

About

Stereo matching is a fundamental problem in computer vision. Despite recent progress driven by deep learning, improving robustness remains essential when deploying stereo-matching models in real-world applications. Unlike the common practice of developing an elaborate model to achieve robustness, we argue that collecting multiple available datasets for training is a cheaper way to increase generalization ability. Specifically, this report presents an improved RaftStereo trained with a mixture of seven public datasets for the Robust Vision Challenge (denoted as iRaftStereo_RVC). When evaluated on the training sets of Middlebury, KITTI-2015, and ETH3D, the model outperforms its counterparts trained with only one dataset, such as the popular Sceneflow. After fine-tuning the pre-trained model on the three datasets of the challenge, it ranks 2nd on the stereo leaderboard, demonstrating the benefits of mixed-dataset pre-training.
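The mixed-dataset idea can be illustrated with a small sketch. The report does not say how the seven datasets are combined or sampled, so the code below assumes plain concatenation with uniform shuffling. `MixedDataset`, `toy_a`, and `toy_b` are hypothetical names used only for illustration, and the toy lists stand in for real stereo datasets.

```python
import bisect
import random


class MixedDataset:
    """Concatenates several indexable datasets into one (hypothetical helper)."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Cumulative sizes map a global index to (dataset, local index).
        self.cumulative = []
        total = 0
        for dataset in self.datasets:
            total += len(dataset)
            self.cumulative.append(total)

    def __len__(self):
        return self.cumulative[-1] if self.cumulative else 0

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Find the first dataset whose cumulative size exceeds the index.
        which = bisect.bisect_right(self.cumulative, index)
        start = self.cumulative[which - 1] if which > 0 else 0
        return self.datasets[which][index - start]


# Placeholder datasets: each sample is just a (source name, sample id) tuple.
toy_a = [("dataset_a", i) for i in range(3)]
toy_b = [("dataset_b", i) for i in range(5)]
mixed = MixedDataset([toy_a, toy_b])

# Shuffle indices over the union, as a data loader would for one epoch.
rng = random.Random(0)
order = list(range(len(mixed)))
rng.shuffle(order)
epoch = [mixed[i] for i in order]
```

With uniform shuffling, larger datasets contribute proportionally more samples per epoch; whether the authors rebalance the sources is not stated in the abstract.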

Hualie Jiang, Rui Xu, Wenjie Jiang • 2022

Related benchmarks

Task             Dataset                    Metric                 Result  Rank
Stereo Matching  KITTI 2015 (non-occluded)  D1 Error (Background)  1.76    25
Stereo Matching  Middlebury non-occluded    Bad Pixel Rate (2.0)   8.07    20
Stereo Matching  ETH3D (non-occluded)       Bad 1.0 Error          1.62    19
Stereo Matching  Middlebury v3              Average Error          2.9     17
Stereo Matching  ETH3D RVC (all)            Bad 1.0 Error          1.88    9
Stereo Matching  KITTI RVC 2015 (all)       D1 Error (bg)          1.88    9
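For readers unfamiliar with the metrics listed above, the sketch below shows how such error rates are commonly computed from predicted and ground-truth disparities. It assumes flat lists of disparities and that a ground-truth value of 0 or less marks an invalid pixel; the function names are hypothetical, and each official benchmark applies its own masks and evaluation resolution, so these functions will not reproduce leaderboard numbers exactly.

```python
def bad_pixel_rate(pred, gt, threshold):
    """Percentage of valid pixels whose absolute disparity error exceeds threshold."""
    # Assumption: gt <= 0 marks pixels without ground truth.
    errors = [abs(p - g) for p, g in zip(pred, gt) if g > 0]
    if not errors:
        return 0.0
    return 100.0 * sum(e > threshold for e in errors) / len(errors)


def d1_error(pred, gt):
    """KITTI-style D1: a pixel is an outlier if its error exceeds
    both 3 px and 5% of the true disparity."""
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not valid:
        return 0.0
    outliers = sum(
        abs(p - g) > 3.0 and abs(p - g) > 0.05 * g for p, g in valid
    )
    return 100.0 * outliers / len(valid)


def average_error(pred, gt):
    """Mean absolute disparity error over valid pixels, in pixels."""
    errors = [abs(p - g) for p, g in zip(pred, gt) if g > 0]
    return sum(errors) / len(errors) if errors else 0.0


# Toy data: the last pixel has no ground truth and is ignored.
gt = [10.0, 20.0, 100.0, 0.0]
pred = [10.5, 23.5, 104.0, 50.0]

bad_1 = bad_pixel_rate(pred, gt, 1.0)
bad_2 = bad_pixel_rate(pred, gt, 2.0)
d1 = d1_error(pred, gt)
avg = average_error(pred, gt)
```

In the toy data, the third pixel is off by 4 px but counts as an inlier for D1 because 4 px is below 5% of its true disparity of 100, while it still counts as a bad pixel at the 1.0 and 2.0 thresholds.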
