The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection
About
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving, yet its accuracy is still far from satisfactory. In this paper, we dig into the 3D object detection task and reformulate it as the sub-tasks of object localization and appearance perception, which enables a deeper exploitation of the reciprocal information underlying the entire task. We introduce a Dynamic Feature Reflecting Network, named DFR-Net, which contains two novel standalone modules: (i) the Appearance-Localization Feature Reflecting module (ALFR), which first separates task-specific features and then self-mutually reflects the reciprocal features; (ii) the Dynamic Intra-Trading module (DIT), which adaptively realigns the training processes of the various sub-tasks in a self-learning manner. Extensive experiments on the challenging KITTI dataset demonstrate the effectiveness and generalization of DFR-Net. We rank 1st among all monocular 3D object detectors on the KITTI test set (as of March 16th, 2021). The proposed method can also be plugged into many cutting-edge 3D detection frameworks at negligible cost to boost performance. The code will be made publicly available.
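The abstract does not say how DIT realigns the training of the sub-tasks. As a rough illustration of the general idea of learning per-task loss weights during training, the sketch below uses uncertainty-style weighting, a different and widely known technique. It is not the paper's DIT module, and the function names and loss values are hypothetical.

```python
import math


def weighted_total_loss(task_losses, log_vars):
    """Combine per-task losses using learnable log-variance terms.

    Each task contributes exp(-s) * loss + s, so a task with a large loss
    is down-weighted as its s grows. This is an assumed stand-in for
    adaptive sub-task balancing, not the DIT formulation.
    """
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def update_log_vars(task_losses, log_vars, lr=0.1):
    """One gradient-descent step on the log-variances, losses held fixed."""
    # d/ds [exp(-s) * L + s] = 1 - exp(-s) * L
    return [s - lr * (1.0 - math.exp(-s) * loss)
            for loss, s in zip(task_losses, log_vars)]


# Hypothetical loss values for the two sub-tasks named in the abstract.
losses = {"localization": 4.0, "appearance": 0.5}
log_vars = [0.0, 0.0]
for _ in range(200):
    log_vars = update_log_vars(list(losses.values()), log_vars)

# Learned weights approach 1 / loss for each task.
weights = [math.exp(-s) for s in log_vars]
```

With the losses held fixed, each weight settles near the reciprocal of its task's loss, so the larger localization loss is scaled down and the smaller appearance loss is scaled up. In real training the losses change at every step, and the weights would be optimized jointly with the network parameters.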
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Object Detection | KITTI car (test) | -- | -- | 195 |
| 3D Object Detection | KITTI Pedestrian (test) | AP3D (Easy) | 609 | 63 |
| 3D Object Detection | KITTI car (val) | -- | -- | 62 |
| Bird's Eye View Object Detection (Car) | KITTI (test) | APBEV (Easy) @ IoU=0.7 | 28.17 | 59 |
| 3D Object Detection | KITTI Cyclist (test) | AP3D (Easy) | 5.69 | 49 |
| 3D Object Detection (Car) | KITTI (test) | AP3D (Easy) @ IoU=0.7 | 19.4 | 36 |
| 3D Object Detection | KITTI (test) | AP3D (Easy) | 19.4 | 26 |
| Monocular 3D Object Detection | KITTI car (test) | AP3D R40 (Easy, IoU=0.7) | 19.4 | 19 |
| Monocular 3D Object Detection (Car) | KITTI official (test) | AP3D (Easy) | 19.4 | 17 |
| 3D Object Detection | KITTI Cyclist official (test) | AP3D (Easy) | 5.69 | 8 |