PointPainting: Sequential Fusion for 3D Object Detection

About

Camera and lidar are important sensor modalities for robotics in general and self-driving cars in particular. The sensors provide complementary information, offering an opportunity for tight sensor fusion. Surprisingly, lidar-only methods outperform fusion methods on the main benchmark datasets, suggesting a gap in the literature. In this work, we propose PointPainting: a sequential fusion method to fill this gap. PointPainting works by projecting lidar points into the output of an image-only semantic segmentation network and appending the class scores to each point. The appended (painted) point cloud can then be fed to any lidar-only method. Experiments show large improvements on three different state-of-the-art methods, PointRCNN, VoxelNet and PointPillars, on the KITTI and nuScenes datasets. The painted version of PointRCNN represents a new state of the art on the KITTI leaderboard for the bird's-eye view detection task. In ablation, we study how the effects of Painting depend on the quality and format of the semantic segmentation output, and demonstrate how latency can be minimized through pipelining.
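The painting step described in the abstract (project each lidar point into the segmentation output, then append the class scores at that pixel) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `paint_points`, its argument layout, the pinhole camera model, and the choice to leave all-zero scores on points that fall outside the image are assumptions made for illustration only.

```python
import numpy as np


def paint_points(points, seg_scores, lidar_to_cam, intrinsics):
    """Append per-pixel class scores to each lidar point (illustrative sketch).

    points:       (N, 3+) array, xyz in the lidar frame; extra columns are kept
    seg_scores:   (H, W, C) class scores from an image segmentation network
    lidar_to_cam: (4, 4) homogeneous transform, lidar frame -> camera frame
    intrinsics:   (3, 3) pinhole camera matrix (assumed model)
    """
    n = points.shape[0]
    h, w, c = seg_scores.shape

    # Lift xyz to homogeneous coordinates and move into the camera frame.
    xyz1 = np.hstack([points[:, :3], np.ones((n, 1))])
    cam = (lidar_to_cam @ xyz1.T).T[:, :3]

    # Project onto the image plane.
    uvw = (intrinsics @ cam.T).T
    depth = uvw[:, 2]
    in_front = depth > 1e-6                      # points behind the camera are skipped
    safe_depth = np.where(in_front, depth, 1.0)  # avoid dividing by zero or negatives
    u = np.floor(uvw[:, 0] / safe_depth).astype(int)
    v = np.floor(uvw[:, 1] / safe_depth).astype(int)

    # Keep only points that land inside the image bounds.
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Assumption: points with no matching pixel keep all-zero scores.
    painted = np.zeros((n, c), dtype=seg_scores.dtype)
    painted[valid] = seg_scores[v[valid], u[valid]]

    # Painted cloud: original columns followed by C class-score columns.
    return np.hstack([points, painted]), valid
```

The returned array has the original point columns followed by C score columns, so a lidar-only detector would only need its input channel count widened to consume it.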

Sourabh Vora, Alex H. Lang, Bassam Helou, Oscar Beijbom • 2019

Related benchmarks

Task	Dataset	Result
3D Object Detection	nuScenes (val)	NDS 69.6	981
3D Object Detection	nuScenes (test)	mAP 46.4	924
3D Object Detection	nuScenes v1.0 (test)	mAP 54.1	239
Semantic segmentation	SemanticKITTI (val)	mIoU 54.5	212
3D Object Detection	nuScenes v1.0-trainval (val)	NDS 69.9	191
3D Semantic Segmentation	SemanticKITTI (val)	mIoU 54.5	75
3D Object Detection	ONCE (val)	Overall mAP 57.8	63
BEV Semantic Segmentation	nuScenes (val)	Drivable Area IoU 75.9	55
3D Object Detection	KITTI (val)	mAP3D - Car (Easy) 88.38	45
3D Object Detection	KITTI (test)	AP Car (IoU=0.7) Easy 82.11	38

