Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving

About

Pseudo-LiDAR 3D detectors have made remarkable progress in monocular 3D detection by enhancing depth perception with depth estimation networks and by using LiDAR-based 3D detection architectures. Advanced stereo 3D detectors can also accurately localize 3D objects. The gap in image-to-image generation for stereo views is much smaller than that in image-to-LiDAR generation. Motivated by this, we propose a Pseudo-Stereo 3D detection framework with three novel virtual view generation methods, namely image-level generation, feature-level generation, and feature-clone, for detecting 3D objects from a single image. Our analysis of depth-aware learning shows that the depth loss is effective only in feature-level virtual view generation, while the estimated depth map is effective in both image-level and feature-level generation in our framework. We propose a disparity-wise dynamic convolution with dynamic kernels sampled from the disparity feature map to filter the features adaptively from a single image for generating virtual image features, which eases the feature degradation caused by depth estimation errors. As of submission (November 18, 2021), our Pseudo-Stereo 3D detection framework ranks 1st on car, pedestrian, and cyclist among the published monocular 3D detectors on the KITTI-3D benchmark. The code is released at https://github.com/revisitq/Pseudo-Stereo-3D.
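
To make the idea of image-level virtual view generation more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a rectified stereo geometry, converts an estimated depth map to disparity with the standard relation d = f * B / z, and forward-warps the left image into a virtual right view. The function names, the nearest-surface tie-breaking rule, and the toy values are all illustrative assumptions; the released repository may do this differently.

```python
from typing import List, Optional

Grid = List[List[float]]


def depth_to_disparity(depth: Grid, focal_px: float, baseline_m: float) -> Grid:
    """Convert per-pixel depth (metres) to disparity (pixels): d = f * B / z.

    Pixels with non-positive depth are treated as invalid and get disparity 0.
    """
    return [
        [focal_px * baseline_m / z if z > 0 else 0.0 for z in row]
        for row in depth
    ]


def forward_warp_to_right(
    left: Grid, disparity: Grid, fill: Optional[float] = None
) -> List[List[Optional[float]]]:
    """Forward-warp a left image into a virtual right view (hypothetical helper).

    In a rectified pair, a left pixel at column u appears at column u - d in
    the right image. When several source pixels land on the same target, the
    one with the largest disparity (nearest surface) wins. Targets that no
    source pixel reaches keep the `fill` value (holes).
    """
    height, width = len(left), len(left[0])
    right: List[List[Optional[float]]] = [[fill] * width for _ in range(height)]
    best = [[-1.0] * width for _ in range(height)]  # largest disparity seen so far
    for v in range(height):
        for u in range(width):
            d = disparity[v][u]
            target = int(round(u - d))
            if 0 <= target < width and d > best[v][target]:
                right[v][target] = left[v][u]
                best[v][target] = d
    return right


# Toy example: a flat scene 10 m away with f = 20 px and B = 0.5 m,
# so every pixel has a disparity of exactly 1 px.
toy_depth = [[10.0, 10.0, 10.0, 10.0]]
toy_left = [[1.0, 2.0, 3.0, 4.0]]
toy_disp = depth_to_disparity(toy_depth, focal_px=20.0, baseline_m=0.5)
toy_right = forward_warp_to_right(toy_left, toy_disp)
```

In this toy case the whole row shifts one pixel to the left and the last column becomes a hole. A practical pipeline would also need hole filling and sub-pixel interpolation, which this sketch omits.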

Yi-Nan Chen, Hang Dai, Yong Ding • 2022

Related benchmarks

Task	Dataset	Result
3D Object Detection	KITTI (val)	AP3D (Moderate) 24.15	85
3D Object Detection	KITTI Pedestrian (test)	AP3D (Easy) 1.70e+3	75
3D Object Detection	KITTI (test)	--	60
Bird's eye view object detection	KITTI (test)	APBEV@0.7 (Easy) 32.64	53
3D Object Detection	KITTI official (test)	3D AP (Easy) 23.74	43
3D Object Detection	KITTI official (val)	AP40 Easy 35.18	31
BEV Object Detection	KITTI official (test)	AP40 Easy 32.84	22
Monocular 3D Object Detection (Car)	KITTI official (test)	AP3D (Easy) 23.74	17
Pedestrian Detection	KITTI (test)	AP BEV (Easy) 12.8	9
