
PVT: Point-Voxel Transformer for Point Cloud Learning

About

Recently developed pure Transformer architectures have attained promising accuracy on point cloud learning benchmarks compared to convolutional neural networks. However, existing point cloud Transformers are computationally expensive because they spend a significant amount of time structuring the irregular data. To address this shortcoming, we present a Sparse Window Attention (SWA) module that gathers coarse-grained local features from non-empty voxels. It bypasses both the expensive irregular data structuring and the wasted computation on empty voxels, and it achieves linear computational complexity with respect to voxel resolution. Meanwhile, to gather fine-grained features about the global shape, we introduce a relative attention (RA) module, a self-attention variant that is more robust to rigid transformations of objects. Equipped with SWA and RA, we construct a neural architecture called PVT that integrates both modules into a joint framework for point cloud learning. Compared with previous Transformer-based and attention-based models, our method attains a top accuracy of 94.0% on the classification benchmark and a 10x inference speedup on average. Extensive experiments also validate the effectiveness of PVT on part and semantic segmentation benchmarks (86.6% and 69.2% mIoU, respectively).
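The abstract's point about restricting computation to non-empty voxels can be illustrated with a minimal NumPy sketch. It is not the SWA module and is not taken from the PVT code: it only voxelizes a point cloud and keeps the occupied cells, so that later work would scale with the number of occupied voxels rather than the full grid. The function name `nonempty_voxels`, the normalization scheme, and the choice of per-voxel mean as the feature are all assumptions made for illustration.

```python
import numpy as np

def nonempty_voxels(points, resolution):
    """Return flat keys, centroids, and point counts of occupied voxels.

    Hypothetical helper for illustration; not part of the PVT implementation.
    """
    points = np.asarray(points, dtype=np.float64)
    # Normalize coordinates into [0, 1) using the largest axis extent.
    mins = points.min(axis=0)
    span = (points.max(axis=0) - mins).max() + 1e-9
    normalized = (points - mins) / span
    # Integer voxel coordinates on a resolution^3 grid.
    idx = np.clip((normalized * resolution).astype(np.int64), 0, resolution - 1)
    # Flatten (x, y, z) into a single key per voxel.
    keys = (idx[:, 0] * resolution + idx[:, 1]) * resolution + idx[:, 2]
    # Only voxels that contain at least one point appear in `occupied`.
    occupied, inverse, counts = np.unique(
        keys, return_inverse=True, return_counts=True
    )
    # Assumed per-voxel feature: the mean position of the points inside it.
    sums = np.zeros((occupied.size, 3))
    np.add.at(sums, inverse, points)
    centroids = sums / counts[:, None]
    return occupied, centroids, counts

# Usage: a random cloud of 1024 points on a 32^3 grid.
rng = np.random.default_rng(0)
pts = rng.random((1024, 3))
occupied, centroids, counts = nonempty_voxels(pts, resolution=32)
print(len(occupied), "of", 32 ** 3, "voxels are occupied")
```

Because at most one voxel per point can be occupied, the arrays returned here never exceed the number of input points, regardless of how fine the grid is.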

Cheng Zhang, Haocheng Wan, Xinyi Shen, Zizhao Wu • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Semantic segmentation | S3DIS (Area 5) | mIOU 68.21 | 799 |
| Semantic segmentation | SemanticKITTI (test) | mIoU 64.9 | 335 |
| Semantic segmentation | S3DIS (6-fold) | mIoU (Mean IoU) 69.2 | 315 |
| Part Segmentation | ShapeNetPart (test) | mIoU (Inst.) 86.6 | 312 |
| Shape classification | ModelNet40 (test) | OA 94.1 | 255 |
| Object Classification | ModelNet40 (test) | Accuracy 93.7 | 180 |
| Classification | ModelNet40 (test) | Accuracy 93.6 | 99 |
| Shape Part Segmentation | ShapeNet (test) | Mean IoU 86.5 | 95 |
| 3D Point Cloud Classification | ModelNet40 | Accuracy 94.1 | 69 |
| Part Segmentation | ShapeNet part | mIoU 86.6 | 46 |
Showing 10 of 11 rows
