MonoCD: Monocular 3D Object Detection with Complementary Depths

About

Monocular 3D object detection has attracted widespread attention due to its potential to accurately localize objects in 3D from a single image at low cost. Depth estimation is an essential but challenging subtask of monocular 3D object detection because the 2D-to-3D mapping is ill-posed. Many methods explore multiple local depth clues, such as object heights and keypoints, and then formulate object depth estimation as an ensemble of multiple depth predictions to mitigate the insufficiency of single-depth information. However, the errors of the existing depth predictions tend to have the same sign, which prevents them from neutralizing each other and limits the overall accuracy of the combined depth. To alleviate this problem, we propose to increase the complementarity of depths with two novel designs. First, we add a new depth prediction branch, named complementary depth, that uses global and efficient depth clues from the entire image rather than local clues to reduce the correlation of depth predictions. Second, we propose to fully exploit the geometric relations between multiple depth clues to achieve complementarity in form. Benefiting from these designs, our method achieves higher complementarity. Experiments on the KITTI benchmark demonstrate that our method achieves state-of-the-art performance without introducing extra data. In addition, complementary depth can also serve as a lightweight, plug-and-play module to boost multiple existing monocular 3D object detectors. Code is available at https://github.com/elvintanhust/MonoCD.
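The point about error signs can be illustrated with a small worked example. The sketch below is not taken from the MonoCD code: it assumes a plain unweighted average as the ensemble, and the true depth and per-branch predictions are hypothetical numbers chosen so that both sets have the same per-branch error magnitudes (1.0 m, 2.0 m, 1.5 m). Only the sign of the third error differs.

```python
def combine(depths):
    """Plain average of several depth predictions.

    Assumption: a simplified stand-in for an ensemble of depth branches,
    not the combination rule used in the paper.
    """
    return sum(depths) / len(depths)


def abs_error(depths, true_depth):
    """Absolute error of the combined depth against the true depth."""
    return abs(combine(depths) - true_depth)


TRUE_DEPTH = 20.0  # metres, hypothetical ground truth

# All three branches overestimate: errors +1.0, +2.0, +1.5 (same sign).
same_sign = [21.0, 22.0, 21.5]

# Third branch underestimates: errors +1.0, +2.0, -1.5 (mixed signs).
mixed_sign = [21.0, 22.0, 18.5]

same_sign_error = abs_error(same_sign, TRUE_DEPTH)    # 1.5 m
mixed_sign_error = abs_error(mixed_sign, TRUE_DEPTH)  # 0.5 m

print(f"same-sign errors  -> combined error {same_sign_error:.2f} m")
print(f"mixed-sign errors -> combined error {mixed_sign_error:.2f} m")
```

With identical error magnitudes, the combined error drops from 1.5 m to 0.5 m once one branch errs in the opposite direction, which is the kind of cancellation that complementary predictions are meant to enable.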

Longfei Yan, Pei Yan, Shengzhou Xiong, Xuanyu Xiang, Yihua Tan • 2024

Related benchmarks

Task	Dataset	Metric	Result	
3D Object Detection	KITTI car (test)	AP3D (Easy)	25.53	226
3D Object Detection	KITTI car (val)	AP 3D Easy	26.45	110
Bird's Eye View Object Detection (Car)	KITTI (test)	APBEV (Easy) @IoU=0.7	33.41	59
Bird's Eye View (BEV) Detection	KITTI Cars (IoU3D ≥ 0.7) (test)	APBEV R40 (Easy)	33.41	52
Monocular 3D Object Detection	KITTI (test)	AP3D R40 (Mod.)	16.59	44
3D Object Detection	KITTI (test)	3D AP (Easy)	25.53	43
Monocular 3D Object Detection	KITTI car category (val)	AP 3D (R40)	19.37	37
3D Object Detection	KITTI official (val)	AP40 Easy	26.45	31
3D Object Detection	KITTI official (test)	AP3D Easy	25.53	29
3D Object Detection	KITTI (val)	AP3D (Easy)	26.45	28

Showing 10 of 21 rows
