
Objects are Different: Flexible Monocular 3D Object Detection

About

The precise localization of 3D objects from a single image without depth information is a highly challenging problem. Most existing methods adopt the same approach for all objects regardless of their diverse distributions, leading to limited performance for truncated objects. In this paper, we propose a flexible framework for monocular 3D object detection which explicitly decouples the truncated objects and adaptively combines multiple approaches for object depth estimation. Specifically, we decouple the edge of the feature map for predicting long-tail truncated objects so that the optimization of normal objects is not influenced. Furthermore, we formulate the object depth estimation as an uncertainty-guided ensemble of directly regressed object depth and solved depths from different groups of keypoints. Experiments demonstrate that our method outperforms the state-of-the-art method by a relative 27% at the moderate level and 30% at the hard level on the test set of the KITTI benchmark while maintaining real-time efficiency. Code will be available at https://github.com/zhangyp15/MonoFlex.
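To make the "uncertainty-guided ensemble" concrete, here is a minimal sketch in Python. It assumes each depth estimate comes with a positive predicted uncertainty and that estimates are weighted by the inverse of that uncertainty. The function names (`depth_from_height`, `ensemble_depth`), the pinhole-camera height relation used for the keypoint-based depth, and all numbers are illustrative assumptions, not taken from the authors' code.

```python
from typing import Sequence


def depth_from_height(focal_px: float, height_m: float, height_px: float) -> float:
    """Pinhole-camera depth from a 3D height (metres) and its projected height (pixels).

    Hypothetical helper: stands in for a depth solved from one group of keypoints.
    """
    if height_px <= 0:
        raise ValueError("projected pixel height must be positive")
    return focal_px * height_m / height_px


def ensemble_depth(depths: Sequence[float], sigmas: Sequence[float]) -> float:
    """Combine depth estimates with weights proportional to 1 / sigma.

    Assumption: a smaller predicted uncertainty means a more trusted estimate.
    """
    if not depths or len(depths) != len(sigmas):
        raise ValueError("depths and sigmas must be non-empty and of equal length")
    if any(s <= 0 for s in sigmas):
        raise ValueError("uncertainties must be positive")
    weights = [1.0 / s for s in sigmas]
    return sum(w * z for w, z in zip(weights, depths)) / sum(weights)


# Illustrative values only: one directly regressed depth and two keypoint-based depths.
direct_depth = 20.0
keypoint_depth_a = depth_from_height(focal_px=700.0, height_m=1.5, height_px=50.0)
keypoint_depth_b = 22.0

combined = ensemble_depth(
    depths=[direct_depth, keypoint_depth_a, keypoint_depth_b],
    sigmas=[1.0, 0.5, 2.0],
)
print(f"combined depth: {combined:.3f} m")
```

With this weighting, the estimate with the smallest uncertainty pulls the combined depth toward itself, while an unreliable estimate (for example, one derived from keypoints that fall outside the image for a truncated object) contributes little.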

Yunpeng Zhang, Jiwen Lu, Jie Zhou • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Object Detection | KITTI car (test) | AP3D (Easy) | 19.94 | 226 |
| 3D Object Detection | KITTI car (val) | AP 3D Easy | 23.64 | 110 |
| 3D Object Detection | KITTI (val) | AP3D (Moderate) | 17.51 | 85 |
| 3D Object Detection | KITTI Pedestrian (test) | AP3D (Easy) | 1.19e+3 | 75 |
| 3D Object Detection | KITTI (test) | 3D AP Easy | 19.94 | 61 |
| Bird's Eye View Object Detection (Car) | KITTI (test) | APBEV (Easy) @IoU=0.7 | 28.23 | 59 |
| 3D Object Detection | KITTI Cyclist (test) | AP3D Easy | 3.39 | 59 |
| Bird's eye view object detection | KITTI (test) | APBEV@0.7 (Easy) | 28.23 | 53 |
| Monocular 3D Object Detection | KITTI (test) | AP3D R40 (Mod.) | 12.57 | 44 |
| 3D Object Detection | KITTI (test) | 3D AP (Easy) | 19.94 | 43 |
Showing 10 of 35 rows
