Deep Fitting Degree Scoring Network for Monocular 3D Object Detection
About
In this paper, we propose to learn a deep fitting degree scoring network for monocular 3D object detection, which aims to score the fitting degree between proposals and the object conclusively. Unlike most existing monocular frameworks, which use a tight constraint to obtain the 3D location, our approach achieves high-precision localization by measuring the visual fitting degree between the projected 3D proposals and the object. We first regress the dimensions and orientation of the object using an anchor-based method so that a suitable 3D proposal can be constructed. We then propose FQNet, which can infer the 3D IoU between the 3D proposals and the object based solely on 2D cues. During detection, we therefore sample a large number of candidates in 3D space and project each of these 3D bounding boxes onto the 2D image individually. The best candidate is picked out by exploring the spatial overlap between the proposals and the object, in the form of the 3D IoU score output by FQNet. Experiments on the KITTI dataset demonstrate the effectiveness of our framework.
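The sample, project, and score loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the pinhole intrinsics, and the Gaussian sampling scheme are all hypothetical, and `score_fn` is only a placeholder for FQNet, which in the paper is a learned network that predicts a 3D IoU from image cues. The box is assumed to be given by its geometric center in camera coordinates with yaw about the camera y axis.

```python
import math
import random


def box_corners_3d(center, dims, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    center: (x, y, z) geometric center of the box (assumed convention).
    dims:   (h, w, l) height, width, length.
    yaw:    rotation about the camera y axis, in radians.
    """
    x, y, z = center
    h, w, l = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (l / 2, -l / 2):
        for dy in (h / 2, -h / 2):
            for dz in (w / 2, -w / 2):
                # Rotate the local offset about the y axis, then translate.
                rx = c * dx + s * dz
                rz = -s * dx + c * dz
                corners.append((x + rx, y + dy, z + rz))
    return corners


def project(points, fx, fy, cx, cy):
    """Pinhole projection; assumes every point has depth Z > 0."""
    return [(fx * X / Z + cx, fy * Y / Z + cy) for X, Y, Z in points]


def sample_candidates(seed, n, sigma, rng):
    """Hypothetical sampling: Gaussian jitter of x and z around a seed location."""
    sx, sy, sz = seed
    return [(sx + rng.gauss(0, sigma), sy, sz + rng.gauss(0, sigma))
            for _ in range(n)]


def pick_best(candidates, dims, yaw, intrinsics, score_fn):
    """Return the candidate whose projected box gets the highest score.

    score_fn stands in for FQNet: it receives the 8 projected corners
    and returns a scalar, higher meaning a better fit.
    """
    def score(center):
        corners_2d = project(box_corners_3d(center, dims, yaw), *intrinsics)
        return score_fn(corners_2d)

    return max(candidates, key=score)


if __name__ == "__main__":
    rng = random.Random(0)
    intrinsics = (700.0, 700.0, 600.0, 180.0)  # hypothetical fx, fy, cx, cy
    dims, yaw = (1.5, 1.6, 3.9), 0.3           # as if regressed in the first stage
    candidates = sample_candidates((2.0, 1.0, 20.0), 100, 0.5, rng)

    # Toy score: prefer boxes whose projected corners lie near a target pixel.
    def toy_score(corners_2d, target=(670.0, 215.0)):
        u = sum(p[0] for p in corners_2d) / len(corners_2d)
        v = sum(p[1] for p in corners_2d) / len(corners_2d)
        return -math.hypot(u - target[0], v - target[1])

    print(pick_best(candidates, dims, yaw, intrinsics, toy_score))
```

Passing the scorer as a parameter keeps the geometry separate from the learned component, so a trained model could replace `toy_score` without changing the sampling or projection code.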
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Object Detection | KITTI car (test) | AP3D (Easy): 2.77 | 195 |
| 3D Object Detection | KITTI (val) | -- | 85 |
| 3D Object Detection | KITTI (test) | AP_3D (Easy): 2.77 | 83 |
| 3D Object Detection | KITTI (val) | -- | 57 |
| Bird's Eye View (BEV) Detection | KITTI Cars (IoU3D ≥ 0.7) (test) | APBEV R40 (Easy): 5.4 | 52 |
| Monocular 3D Object Detection | KITTI car category (val) | -- | 37 |
| 3D Object Detection | KITTI (test) | AP_R40 (Easy): 2.77 | 30 |
| 3D Object Detection | KITTI official Leaderboard (test) | AP3D (Easy): 2.77 | 30 |
| Bird's eye view object detection | KITTI official Leaderboard (test) | APBEV (Easy): 5.4 | 29 |
| 3D Object Detection | KITTI (test) | AP3D (Easy): 2.77 | 26 |