3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization

About

The complementary characteristics of active and passive depth sensing techniques motivate the fusion of the LiDAR sensor and stereo camera for improved depth perception. Instead of directly fusing estimated depths across LiDAR and stereo modalities, we take advantage of the stereo matching network with two enhanced techniques: Input Fusion and Conditional Cost Volume Normalization (CCVNorm) on the LiDAR information. The proposed framework is generic and closely integrated with the cost volume component that is commonly utilized in stereo matching neural networks. We experimentally verify the efficacy and robustness of our method on the KITTI Stereo and Depth Completion datasets, obtaining favorable performance against various fusion strategies. Moreover, we demonstrate that, with a hierarchical extension of CCVNorm, the proposed method brings only slight overhead to the stereo matching network in terms of computation time and model size. For the project page, see https://zswang666.github.io/Stereo-LiDAR-CCVNorm-Project-Page/
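
The abstract names CCVNorm without giving its formulation, so the following is only an illustrative NumPy sketch of the general idea of normalizing a cost volume conditioned on sparse LiDAR, not the authors' implementation. It assumes a cost volume of shape (C, D, H, W), a LiDAR disparity map in which -1 marks pixels without a measurement, and hypothetical lookup tables `gamma` and `beta` indexed by the rounded LiDAR disparity; the function name, the sentinel value, and the per-sample statistics are all assumptions made for the example.

```python
import numpy as np

def ccvnorm_sketch(cost, lidar_disp, gamma, beta, gamma_inv, beta_inv, eps=1e-5):
    """Illustrative conditional normalization of a cost volume (assumed layout).

    cost:       (C, D, H, W) cost-volume features
    lidar_disp: (H, W) sparse LiDAR disparity, -1 where no measurement exists
    gamma/beta: (D, C, D) hypothetical tables indexed by rounded LiDAR disparity
    gamma_inv/beta_inv: (C, D) parameters used at pixels without LiDAR
    """
    C, D, H, W = cost.shape

    # Normalize each channel over disparity and spatial dimensions.
    mean = cost.mean(axis=(1, 2, 3), keepdims=True)
    var = cost.var(axis=(1, 2, 3), keepdims=True)
    normed = (cost - mean) / np.sqrt(var + eps)

    # Look up a per-pixel scale and shift from the LiDAR disparity bin.
    valid = lidar_disp >= 0                                   # (H, W)
    bins = np.clip(np.rint(lidar_disp), 0, D - 1).astype(int)  # (H, W)
    g = np.transpose(gamma[bins], (2, 3, 0, 1))               # (C, D, H, W)
    b = np.transpose(beta[bins], (2, 3, 0, 1))

    # Pixels without LiDAR fall back to the shared "invalid" parameters.
    g = np.where(valid, g, gamma_inv[:, :, None, None])
    b = np.where(valid, b, beta_inv[:, :, None, None])
    return normed * g + b

# Tiny usage example with random features and a single LiDAR point.
rng = np.random.default_rng(0)
C, D, H, W = 2, 4, 3, 5
cost = rng.normal(size=(C, D, H, W))
lidar_disp = -np.ones((H, W))
lidar_disp[0, 0] = 2.0
gamma, beta = np.ones((D, C, D)), np.zeros((D, C, D))
gamma_inv, beta_inv = 2.0 * np.ones((C, D)), np.zeros((C, D))
out = ccvnorm_sketch(cost, lidar_disp, gamma, beta, gamma_inv, beta_inv)
```

In this sketch the affine parameters vary per pixel and per disparity level, which is what distinguishes it from ordinary batch normalization; the hierarchical extension mentioned in the abstract is not modeled here.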

Tsun-Hsuan Wang, Hou-Ning Hu, Chieh Hubert Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun • 2019

Related benchmarks

Task	Dataset	Metric	Result
Depth Completion	KITTI depth completion official (test)	RMSE (mm)	749.3	154
Depth Completion	KITTI depth completion (val)	RMSE (mm)	749.3	38
Stereo Matching	KITTI Stereo 2015 (test)	Error Rate (> 3px)	3.35	6
Depth Estimation	Virtual-KITTI 2.0 (test)	RMSE	3.73e+3	4
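
The two metrics named in the table can be sketched as follows. This is a minimal illustration, assuming depths are given in meters, disparities in pixels, and a ground-truth value of 0 marks a missing measurement. The function names are hypothetical, and the simple absolute 3 px threshold used here may differ from the official benchmark definition.

```python
import numpy as np

def rmse_mm(pred_depth_m, gt_depth_m):
    """Root-mean-square depth error in millimeters over valid ground-truth pixels."""
    valid = gt_depth_m > 0  # assumption: 0 means no ground truth
    diff_mm = (pred_depth_m[valid] - gt_depth_m[valid]) * 1000.0
    return float(np.sqrt(np.mean(diff_mm ** 2)))

def error_rate_px(pred_disp, gt_disp, thresh=3.0):
    """Percentage of valid pixels whose absolute disparity error exceeds thresh."""
    valid = gt_disp > 0  # assumption: 0 means no ground truth
    err = np.abs(pred_disp[valid] - gt_disp[valid])
    return float(np.mean(err > thresh) * 100.0)

# Toy inputs: one pixel in each map has no ground truth and is ignored.
gt_depth = np.array([[10.0, 0.0], [20.0, 5.0]])
pred_depth = np.array([[10.5, 3.0], [19.0, 5.0]])
rmse = rmse_mm(pred_depth, gt_depth)

gt_disp = np.array([50.0, 0.0, 30.0, 12.0])
pred_disp = np.array([52.0, 9.0, 34.0, 12.5])
rate = error_rate_px(pred_disp, gt_disp)
```

Masking by valid ground truth matters for LiDAR-derived labels, which are sparse; averaging over all pixels would otherwise count missing measurements as errors.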
