Local Similarity Pattern and Cost Self-Reassembling for Deep Stereo Matching Networks
About
Although stereo matching architectures based on convolutional neural networks have made impressive achievements, they still have some limitations: 1) The convolutional feature (CF) tends to capture appearance information, which is inadequate for accurate matching. 2) Because their filters are static, current convolution-based disparity refinement modules often produce over-smoothed results. In this paper, we present two schemes that address these issues by integrating ideas from traditional methods. Firstly, we introduce a pairwise feature for deep stereo matching networks, named LSP (Local Similarity Pattern). By explicitly revealing the relationships between neighboring pixels, LSP contains rich structural information, which can be leveraged to aid CF for more discriminative feature description. Secondly, we design a dynamic self-reassembling refinement strategy and apply it to the cost distribution and the disparity map respectively. The former can be equipped with the unimodal distribution constraint to alleviate the over-smoothing problem, and the latter is more practical. The effectiveness of the proposed methods is demonstrated by incorporating them into two well-known basic architectures, GwcNet and GANet-deep. Experimental results on the SceneFlow and KITTI benchmarks show that our modules significantly improve the performance of the model.
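The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. It assumes that a local similarity pattern can be approximated by the cosine similarity between each pixel's feature vector and those of its neighbors, and that self-reassembling can be approximated by a per-pixel, softmax-normalized weighted sum over a disparity neighborhood. The function names `local_similarity_pattern` and `reassemble_disparity`, the `radius` parameter, edge padding, and the softmax normalization are all hypothetical choices made for illustration; in a real network the weights would be predicted by learned layers.

```python
import numpy as np


def local_similarity_pattern(feat, radius=1, eps=1e-8):
    """Cosine similarity between each pixel and its neighbors (assumed LSP form).

    feat: array of shape (C, H, W).
    Returns an array of shape (K, H, W), K = (2 * radius + 1) ** 2 - 1.
    """
    _, h, w = feat.shape
    # Normalize each pixel's feature vector to unit length.
    unit = feat / (np.linalg.norm(feat, axis=0, keepdims=True) + eps)
    # Edge padding is an assumption; it keeps the output the same size as the input.
    padded = np.pad(unit, ((0, 0), (radius, radius), (radius, radius)), mode="edge")
    sims = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # skip the center pixel itself
            shifted = padded[:, radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            sims.append((unit * shifted).sum(axis=0))
    return np.stack(sims, axis=0)


def reassemble_disparity(disp, logits, radius=1):
    """Recombine each pixel's neighborhood with per-pixel (dynamic) weights.

    disp:   array of shape (H, W).
    logits: array of shape (K, H, W), K = (2 * radius + 1) ** 2; these stand in
            for weights that a network would predict.
    """
    h, w = disp.shape
    k = 2 * radius + 1
    # Softmax over the neighbor axis so the weights at each pixel sum to 1.
    weights = np.exp(logits - logits.max(axis=0, keepdims=True))
    weights = weights / weights.sum(axis=0, keepdims=True)
    padded = np.pad(disp, radius, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    idx = 0
    for dy in range(k):
        for dx in range(k):
            out += weights[idx] * padded[dy:dy + h, dx:dx + w]
            idx += 1
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(16, 6, 8))
    lsp = local_similarity_pattern(features)            # shape (8, 6, 8)
    disparity = rng.uniform(0.0, 64.0, size=(6, 8))
    refined = reassemble_disparity(disparity, rng.normal(size=(9, 6, 8)))
    print(lsp.shape, refined.shape)
```

Unlike a static convolution filter, the weights in `reassemble_disparity` differ at every pixel, which is the property the abstract relies on to avoid over-smoothing. The same weighted recombination could in principle be applied along the spatial neighborhood of a cost distribution instead of a disparity map.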
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Stereo Matching | KITTI 2015 (test) | D1 Error (Overall) | 1.67 | 144 |
| Stereo Matching | KITTI 2012 (test) | Outlier Rate (3px, Noc) | 1.05 | 76 |
| Stereo Matching | Scene Flow (test) | EPE | 0.75 | 70 |
| Stereo Matching | KITTI 2015 (all pixels) | D1 Error (Background) | 1.44 | 38 |
| Stereo Matching | KITTI 2012 (Noc) | Error Rate (>2px) | 1.72 | 26 |
| Stereo Matching | KITTI 2012 (All split) | Error Rate (>2px) | 2.26 | 26 |
| Stereo Matching | KITTI 2015 (non-occluded) | D1 Error (Background) | 1.26 | 25 |
| Stereo Matching | Virtual KITTI (VKITTI) (val) | 3px Error (Fog) | 1.67 | 5 |