Voxel Field Fusion for 3D Object Detection

About

In this work, we present a conceptually simple yet effective framework for cross-modality 3D object detection, named voxel field fusion. The proposed approach aims to maintain cross-modality consistency by representing and fusing augmented image features as a ray in the voxel field. To this end, a learnable sampler is first designed to sample vital features from the image plane that are projected to the voxel grid in a point-to-ray manner, which maintains the consistency in feature representation with spatial context. In addition, ray-wise fusion is conducted to fuse features with the supplemental context in the constructed voxel field. We further develop a mixed augmentor to align feature-variant transformations, which bridges the modality gap in data augmentation. The proposed framework is demonstrated to achieve consistent gains on various benchmarks and outperforms previous fusion-based methods on the KITTI and nuScenes datasets. Code is made available at https://github.com/dvlab-research/VFF.

Yanwei Li, Xiaojuan Qi, Yukang Chen, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia • 2022
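
The point-to-ray projection mentioned in the abstract can be illustrated with a small geometric sketch. This is not the authors' implementation (see the linked repository for that): it only shows how one image pixel could be lifted to a sequence of voxels along its camera ray. The function name `pixel_to_ray_voxels`, its parameters, the pinhole camera model, and the assumption that the camera frame coincides with the voxel-grid frame are all hypothetical choices made for illustration. The learnable sampler and the ray-wise fusion step are not modeled here.

```python
import math


def pixel_to_ray_voxels(u, v, fx, fy, cx, cy, depths,
                        voxel_size, grid_min, grid_shape):
    """Back-project pixel (u, v) at several depths and return the voxels it hits.

    Assumes a pinhole camera whose frame coincides with the voxel-grid frame
    (x right, y down, z forward). A real pipeline would also apply the
    camera-to-LiDAR extrinsics before voxelizing.
    """
    voxels = []
    for d in depths:
        # Pinhole back-projection: pixel plus depth gives a 3D point.
        point = ((u - cx) / fx * d, (v - cy) / fy * d, d)
        # Quantize the point to integer voxel indices.
        idx = tuple(math.floor((p - lo) / voxel_size)
                    for p, lo in zip(point, grid_min))
        # Keep indices inside the grid and skip consecutive duplicates.
        inside = all(0 <= i < n for i, n in zip(idx, grid_shape))
        if inside and (not voxels or voxels[-1] != idx):
            voxels.append(idx)
    return voxels


# Hypothetical example: the principal-point pixel traces a ray straight
# along the z axis of a 4 x 4 x 4 grid of 1 m voxels.
ray = pixel_to_ray_voxels(
    u=320, v=240, fx=500.0, fy=500.0, cx=320.0, cy=240.0,
    depths=[1.0, 2.0, 3.0],
    voxel_size=1.0, grid_min=(-2.0, -2.0, 0.0), grid_shape=(4, 4, 4),
)
print(ray)
```

In a fusion setting, the image feature sampled at that pixel would be associated with every voxel index returned, so that a single image point contributes along a ray instead of at a single 3D location.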

Related benchmarks

Task | Dataset | Result | Rank
3D Object Detection | nuScenes (test) | mAP 68.4 | 829
3D Instance Segmentation | ScanNet V2 (val) | Average AP50 64.3 | 195
3D Instance Segmentation | S3DIS (Area 5) | mAP@50% IoU 59.3 | 106
3D Object Detection | KITTI (test) | 3D AP Easy 89.58 | 61
3D Object Detection | KITTI (val) | -- | 24
3D Instance Segmentation | ScanNet (test) | mAP 50.6 | 15
