Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection
About
We delve into pseudo-labeling for semi-supervised monocular 3D object detection (SSM3OD) and discover two primary issues: a misalignment between the prediction quality of 3D and 2D attributes, and the tendency of depth supervision derived from pseudo-labels to be noisy, leading to significant optimization conflicts with other reliable forms of supervision. We introduce a novel decoupled pseudo-labeling (DPL) approach for SSM3OD. Our approach features a Decoupled Pseudo-label Generation (DPG) module, designed to efficiently generate pseudo-labels by separately processing 2D and 3D attributes. This module incorporates a unique homography-based method for identifying dependable pseudo-labels in BEV space, specifically for 3D attributes. Additionally, we present a Depth Gradient Projection (DGP) module to mitigate optimization conflicts caused by noisy depth supervision of pseudo-labels, effectively decoupling the depth gradient and removing conflicting gradients. This dual decoupling strategy, applied at both the pseudo-label generation and gradient levels, significantly improves the utilization of pseudo-labels in SSM3OD. Our comprehensive experiments on the KITTI benchmark demonstrate the superiority of our method over existing approaches.
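The abstract does not spell out how the depth gradient is projected, so the following is only a minimal sketch of one common way to remove a conflicting gradient component, not the paper's DGP module. It assumes gradients are flattened into plain lists of floats, and the names `project_out_conflict`, `g_depth`, `g_ref`, and `eps` are hypothetical. When the depth gradient points against a reference gradient from more reliable supervision (negative dot product), the opposing component is subtracted; otherwise the depth gradient is left unchanged.

```python
def project_out_conflict(g_depth, g_ref, eps=1e-12):
    """Return g_depth with any component that opposes g_ref removed.

    Illustrative assumption: both gradients are flat sequences of floats
    of equal length. This is a generic projection, not the paper's DGP.
    """
    # A negative dot product means the two gradients conflict.
    dot = sum(d * r for d, r in zip(g_depth, g_ref))
    if dot >= 0.0:
        return list(g_depth)  # no conflict: keep the depth gradient as-is

    # Subtract the projection of g_depth onto g_ref; eps guards against
    # division by zero when g_ref is the zero vector.
    scale = dot / (sum(r * r for r in g_ref) + eps)
    return [d - scale * r for d, r in zip(g_depth, g_ref)]


# Usage: the depth gradient opposes the reference along the first axis,
# so that component is removed and the second axis is untouched.
projected = project_out_conflict([-1.0, 1.0], [1.0, 0.0])
```

After projection the result is (numerically) orthogonal to the reference gradient, so a step along it no longer works against the reliable supervision to first order.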
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Object Detection | nuScenes (val) | NDS | 0.437 | 941 |
| 3D Object Detection | KITTI car (test) | -- | -- | 195 |
| Bird's Eye View (BEV) Detection | KITTI Cars (IoU3D ≥ 0.7) (test) | APBEV R40 (Easy) | 33.16 | 52 |
| Monocular 3D Object Detection | KITTI (test) | AP3D R40 (Mod.) | 16.67 | 38 |
| Monocular 3D Object Detection | KITTI car category (val) | AP 3D (R40) | 19.84 | 37 |