Fast Point R-CNN

About

We present a unified, efficient, and effective framework for point-cloud-based 3D object detection. Our two-stage approach uses both a voxel representation and raw point cloud data to exploit the advantages of each. The first-stage network takes the voxel representation as input and consists only of light convolutional operations, producing a small number of high-quality initial predictions. The coordinates and indexed convolutional features of each point in an initial prediction are fused with an attention mechanism, preserving both accurate localization and context information. The second stage operates on the interior points and their fused features to further refine the prediction. Our method is evaluated on the KITTI dataset for both 3D and Bird's Eye View (BEV) detection, and achieves state-of-the-art results at a detection rate of 15 FPS.

Yilun Chen, Shu Liu, Xiaoyong Shen, Jiaya Jia • 2019
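
The abstract says that each point's coordinates and its indexed convolutional feature are fused with an attention mechanism, but it does not give the formulation. The sketch below is only one plausible reading, not the authors' implementation: a scalar sigmoid gate per point, computed from a linear map over the concatenated input, scales the convolutional feature while the raw coordinates pass through unchanged. The function name `fuse_point_features` and the parameters `w_coord`, `w_feat`, and `bias` are hypothetical and introduced here for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fuse_point_features(coords, conv_feats, w_coord, w_feat, bias):
    """Fuse per-point coordinates with gated convolutional features.

    Assumed (hypothetical) design, not taken from the paper:
      gate_i  = sigmoid(w_coord . xyz_i + w_feat . feat_i + bias)
      fused_i = [xyz_i, gate_i * feat_i]

    coords     : list of (x, y, z) tuples, one per interior point
    conv_feats : list of feature vectors indexed from the first stage
    """
    fused = []
    for xyz, feat in zip(coords, conv_feats):
        # Attention logit from a linear map over the concatenated input.
        logit = bias
        logit += sum(w * c for w, c in zip(w_coord, xyz))
        logit += sum(w * f for w, f in zip(w_feat, feat))
        gate = sigmoid(logit)
        # Coordinates are kept as-is (localization); features are gated (context).
        fused.append(list(xyz) + [gate * f for f in feat])
    return fused


if __name__ == "__main__":
    coords = [(1.0, 2.0, 3.0), (0.5, -1.0, 0.2)]
    conv_feats = [[2.0, 4.0], [1.0, -3.0]]
    # Zero weights and zero bias give a gate of exactly 0.5 for every point.
    for row in fuse_point_features(coords, conv_feats, [0.0] * 3, [0.0] * 2, 0.0):
        print(row)
```

In a trained network the gate parameters would be learned, and the fused rows would feed the second-stage refinement; here they are passed in by hand so the behavior is easy to inspect.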

Related benchmarks

Task	Dataset	Result
3D Object Detection	KITTI car (test)	AP3D (Easy) 85.29	226
3D Object Detection	KITTI (test)	AP_3D (Easy) 84.28	83
Bird's Eye View Detection	KITTI Car class official (test)	AP (Easy) 90.87	62
3D Object Detection	KITTI (test)	AP_3D Car (Easy) 85.29	60
3D Object Detection	KITTI cars (val)	AP Easy 89.12	48
BEV Object Detection	KITTI (test)	AP (Easy) 88.03	47
3D Object Detection	KITTI (val)	AP3D Easy 89.12	36
Bird's Eye View Detection	KITTI (val)	--	36
3D Object Detection	KITTI (test)	AP (Easy) 85.29	27
BEV Detection	KITTI car (test)	mAP (Easy) 90.87	14

Showing 10 of 12 rows
