Fast Point Transformer

About

The recent success of neural networks enables a better interpretation of 3D point clouds, but processing a large-scale 3D scene remains a challenging problem. Most current approaches divide a large-scale scene into small regions and combine the local predictions. However, this scheme inevitably adds pre- and post-processing stages and may also degrade the final output, because each prediction is made from a local perspective. This paper introduces Fast Point Transformer, which consists of a new lightweight self-attention layer. Our approach encodes continuous 3D coordinates, and the voxel hashing-based architecture boosts computational efficiency. The proposed method is demonstrated on 3D semantic segmentation and 3D detection. The accuracy of our approach is competitive with the best voxel-based method, and our network runs inference 129 times faster than the state-of-the-art Point Transformer, with a reasonable accuracy trade-off, in 3D semantic segmentation on the S3DIS dataset.
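To make the two ideas named in the abstract more concrete (a hash table over voxels, and keeping continuous coordinates rather than only quantized ones), here is a minimal Python sketch. It is not the authors' implementation and does not reproduce the self-attention layer; the function names `voxelize` and `occupied_neighbors`, the `voxel_size` parameter, and the choice of centroid offsets as the continuous signal are all assumptions made for illustration.

```python
from itertools import product

import numpy as np


def voxelize(points, voxel_size):
    """Group points into voxels with a dict used as the hash table.

    Hypothetical helper: returns the table (voxel index -> point indices),
    the per-voxel centroids, and each point's continuous offset from the
    centroid of the voxel it falls into.
    """
    points = np.asarray(points, dtype=np.float64)
    # Integer voxel index of every point (floor of coordinate / voxel size).
    keys = np.floor(points / voxel_size).astype(np.int64)

    table = {}  # voxel index tuple -> list of point indices
    for i, key in enumerate(map(tuple, keys.tolist())):
        table.setdefault(key, []).append(i)

    centroids = {}
    offsets = np.empty_like(points)
    for key, idx in table.items():
        centroids[key] = points[idx].mean(axis=0)
        # The offset keeps the sub-voxel position that quantization discards.
        offsets[idx] = points[idx] - centroids[key]
    return table, centroids, offsets


def occupied_neighbors(table, key, radius=1):
    """Return the occupied voxels within `radius` of `key` (inclusive).

    Each candidate is an average O(1) dict lookup, so no dense 3D grid
    is ever allocated.
    """
    found = []
    for d in product(range(-radius, radius + 1), repeat=3):
        candidate = (key[0] + d[0], key[1] + d[1], key[2] + d[2])
        if candidate in table:
            found.append(candidate)
    return found
```

In this sketch the cost of a neighborhood query depends on the number of candidate cells rather than on the extent of the scene, which is one reason hashing suits large, sparse point clouds. Local attention over the returned neighbors would sit on top of a lookup like this.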

Chunghyun Park, Yoonwoo Jeong, Minsu Cho, Jaesik Park • 2021

Related benchmarks

Task | Dataset | Result | Rank
Semantic segmentation | S3DIS (Area 5) | mIoU 71 | 799
Semantic segmentation | ScanNet V2 (val) | mIoU 72.4 | 288
3D Semantic Segmentation | ScanNet V2 (val) | mIoU 72.1 | 171
3D Semantic Segmentation | ScanNet (val) | -- | 100
3D Object Detection | ScanNet (val) | mAP@0.25 59.1 | 66
3D Semantic Segmentation | S3DIS Area 5 (test) | mIoU (%) 70.3 | 32
3D Semantic Segmentation | ScanNet20 v2 (val) | mIoU 72.1 | 13
