BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection
About
We propose BEVDepth, a 3D object detector with trustworthy depth estimation for camera-based Bird's-Eye-View (BEV) 3D object detection. Our work rests on a key observation: depth estimation in recent approaches is surprisingly inadequate, even though depth is essential to camera-based 3D detection. BEVDepth addresses this by leveraging explicit depth supervision. A camera-aware depth estimation module is also introduced to improve depth prediction. In addition, we design a novel Depth Refinement Module to counter the side effects of imprecise feature unprojection. Aided by a customized Efficient Voxel Pooling and a multi-frame mechanism, BEVDepth achieves a new state-of-the-art 60.9% NDS on the challenging nuScenes test set while maintaining high efficiency. This is the first time the NDS of a camera-based model reaches 60%.
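
To make "explicit depth supervision" and "feature unprojection" more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state the binning scheme, the loss, or the source of reference depth, so the uniform depth bins, the cross-entropy loss, and every function name below are assumptions chosen for illustration.

```python
import math

def depth_to_bin(depth, d_min, d_max, num_bins):
    """Map a metric depth to a uniform bin index; None if out of range (assumed scheme)."""
    if not (d_min <= depth < d_max):
        return None
    bin_size = (d_max - d_min) / num_bins
    # Clamp to guard against floating-point rounding at the upper edge.
    return min(int((depth - d_min) / bin_size), num_bins - 1)

def softmax(logits):
    """Numerically stable softmax over one pixel's depth logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def depth_loss(pixel_logits, pixel_depths, d_min, d_max, num_bins):
    """Mean cross-entropy over pixels that have a valid reference depth."""
    losses = []
    for logits, depth in zip(pixel_logits, pixel_depths):
        target = depth_to_bin(depth, d_min, d_max, num_bins)
        if target is None:
            continue  # no supervision for this pixel
        probs = softmax(logits)
        losses.append(-math.log(probs[target] + 1e-12))
    return sum(losses) / len(losses) if losses else 0.0

def lift(probs, feature):
    """Unproject one pixel: weight its feature vector by each depth bin's probability."""
    return [[p * f for f in feature] for p in probs]
```

In this sketch, `depth_loss` pushes the predicted per-pixel depth distribution toward the bin containing the reference depth, and `lift` shows why that matters: a feature is spread along the camera ray in proportion to the predicted depth probabilities, so an inaccurate distribution places features at the wrong locations in BEV space.
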
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Object Detection | nuScenes (val) | NDS 55.8 | 941 |
| 3D Object Detection | nuScenes (test) | mAP 52 | 829 |
| 3D Object Detection | nuScenes v1.0 (test) | mAP 52 | 210 |
| 3D Object Detection | nuScenes v1.0 (val) | mAP (Overall) 41.2 | 190 |
| 3D Object Detection | nuScenes v1.0-trainval (val) | NDS 47.5 | 87 |
| Object Detection | nuScenes (val) | mAP 41.8 | 41 |
| 3D Object Detection | Rope3D (val) | AP (IoU=0.5, Car) 69.63 | 31 |
| 3D Object Detection | DAIR-V2X-I (val) | -- | 25 |
| 3D Object Detection | Rope3D Car category heterologous benchmark (test) | AP 0.85 | 18 |
| 3D Object Detection | Rope3D Big Vehicle category heterologous benchmark (test) | AP 0.3 | 18 |
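
For readers comparing the NDS and mAP figures, the following sketch shows how the nuScenes detection score (NDS) combines mAP with the five mean true-positive error metrics (mATE, mASE, mAOE, mAVE, mAAE). The function name and argument layout are illustrative assumptions, not part of any official toolkit API.

```python
def nds(m_ap, tp_errors):
    """nuScenes detection score from mAP and the five mean TP errors.

    m_ap: mean average precision as a fraction in [0, 1].
    tp_errors: mATE, mASE, mAOE, mAVE, mAAE; each is clipped at 1 before scoring.
    """
    if len(tp_errors) != 5:
        raise ValueError("expected five true-positive error metrics")
    # Each error becomes a score in [0, 1]; errors above 1 contribute nothing.
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    # mAP carries half of the total weight, the five TP scores the other half.
    return (5.0 * m_ap + sum(tp_scores)) / 10.0
```

The function returns a fraction in [0, 1]; multiply by 100 to compare with percentages such as those quoted in the abstract.
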