
GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection

About

Integrating LiDAR and camera information into Bird's-Eye-View (BEV) representation has emerged as a crucial aspect of 3D object detection in autonomous driving. However, existing methods are susceptible to inaccurate calibration between the LiDAR and camera sensors. Such inaccuracies result in errors in depth estimation for the camera branch, ultimately causing misalignment between LiDAR and camera BEV features. In this work, we propose a robust fusion framework called GraphBEV. To address errors caused by inaccurate point cloud projection, we introduce a Local Align module that employs neighbor-aware depth features via graph matching. Additionally, we propose a Global Align module to rectify the misalignment between LiDAR and camera BEV features. Our GraphBEV framework achieves state-of-the-art performance, with an mAP of 70.1%, surpassing BEVFusion by 1.6% on the nuScenes validation set. Importantly, GraphBEV outperforms BEVFusion by 8.3% under conditions with misalignment noise.
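As a rough geometric intuition for the "neighbor-aware depth" idea, the sketch below projects LiDAR points into the image with a calibration matrix and then collects, for each projected point, the depths of its nearest neighbors in pixel space. This is an illustrative assumption and not the paper's implementation: the function names `project_to_image` and `knn_depths`, the brute-force neighbor search, and the pinhole camera model are all hypothetical choices made here, and the actual Local Align module operates on learned depth features.

```python
import math


def project_to_image(points, extrinsic, intrinsic):
    """Project LiDAR points (x, y, z) to (u, v, depth) pixel tuples.

    extrinsic: 3x4 [R|t] LiDAR-to-camera matrix (hypothetical layout).
    intrinsic: 3x3 pinhole camera matrix K.
    Points that land behind the camera are skipped.
    """
    projected = []
    for x, y, z in points:
        # Transform the point into the camera frame.
        cam = [r[0] * x + r[1] * y + r[2] * z + r[3] for r in extrinsic]
        depth = cam[2]
        if depth <= 0:
            continue
        # Apply the intrinsics, then divide by depth (perspective projection).
        u = sum(k * c for k, c in zip(intrinsic[0], cam)) / depth
        v = sum(k * c for k, c in zip(intrinsic[1], cam)) / depth
        projected.append((u, v, depth))
    return projected


def knn_depths(projected, k):
    """For each projected point, return the depths of its k nearest
    neighbors in pixel space (the point itself counts as one of them)."""
    result = []
    for u, v, _ in projected:
        ranked = sorted(projected,
                        key=lambda p: math.hypot(p[0] - u, p[1] - v))
        result.append([d for _, _, d in ranked[:k]])
    return result


if __name__ == "__main__":
    # Toy calibration: identity extrinsic, simple pinhole intrinsics.
    E = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    K = [[100, 0, 50], [0, 100, 50], [0, 0, 1]]
    pts = [(0, 0, 10), (1, 0, 20), (10, 0, 40)]
    print(knn_depths(project_to_image(pts, E, K), k=2))
```

If the extrinsic matrix is slightly wrong, a projected point shifts in the image and its single depth value may be assigned to the wrong pixel; gathering the depths of nearby projections gives each location some context to fall back on, which is the motivation the abstract describes.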

Ziying Song, Lei Yang, Shaoqing Xu, Lin Liu, Dongyang Xu, Caiyan Jia, Feiyang Jia, Li Wang • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Object Detection | nuScenes (test) | mAP 71.7 | 903 |
| 3D Object Detection | nuScenes (val) | NDS 72.9 | 217 |
| 3D Object Detection | Argoverse 2 (val) | mAP 41.1 | 101 |
| 3D Object Detection | nuScenes LiDAR Beamsreduce | NDS 53.4 | 41 |
| 3D Object Detection | nuScenes Night (val) | mAP 45.1 | 26 |
| 3D Object Detection | nuScenes Rainy (val) | mAP 70.2 | 22 |
| 3D Object Detection | nuScenes | mAP (All) 71.7 | 19 |
| 3D Object Detection | nuScenes Clean | mAP 68.9 | 18 |
| 3D Object Detection | nuScenes LiDAR Limited FOV [-60, 60] | NDS 40.6 | 17 |
| 3D Object Detection | nuScenes Camera View Drop 6 drops | mAP 59.2 | 17 |
Showing 10 of 22 rows
