Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking

About

3D object detectors usually rely on hand-crafted proxies, e.g., anchors or centers, and translate well-studied 2D frameworks to 3D. Thus, sparse voxel features need to be densified and processed by dense prediction heads, which inevitably costs extra computation. In this paper, we instead propose VoxelNext for fully sparse 3D object detection. Our core insight is to predict objects directly based on sparse voxel features, without relying on hand-crafted proxies. Our strong sparse convolutional network VoxelNeXt detects and tracks 3D objects through voxel features entirely. It is an elegant and efficient framework, with no need for sparse-to-dense conversion or NMS post-processing. Our method achieves a better speed-accuracy trade-off than other mainframe detectors on the nuScenes dataset. For the first time, we show that a fully sparse voxel-based representation works decently for LIDAR 3D object detection and tracking. Extensive experiments on nuScenes, Waymo, and Argoverse2 benchmarks validate the effectiveness of our approach. Without bells and whistles, our model outperforms all existing LIDAR methods on the nuScenes tracking test benchmark.

Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia• 2023

Related benchmarks

TaskDatasetResultRank
3D Object DetectionnuScenes (val)
NDS68.7
941
3D Object DetectionnuScenes (test)
mAP66.2
829
3D Object DetectionNuScenes v1.0 (test)
mAP64.5
210
3D Object DetectionnuScenes v1.0 (val)
mAP (Overall)60.5
190
3D Object DetectionWaymo Open Dataset (val)
3D APH Vehicle L269.4
175
3D Multi-Object TrackingnuScenes (test)
ID Switches654
130
3D Object DetectionScanNet
mAP@0.2515.4
123
3D Multi-Object TrackingnuScenes (val)
AMOTA70.2
115
3D Object DetectionWaymo Open Dataset (test)
Vehicle L2 mAPH58.67
105
3D Object DetectionSUN RGB-D
mAP@0.2518.1
104
Showing 10 of 23 rows

Other info

Code

Follow for update