Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding

About

This paper proposes a 4D backbone for long-term point cloud video understanding. A typical way to capture spatial-temporal context is to use 4D convolutions or transformers without hierarchy. However, those methods are neither effective nor efficient enough due to camera motion, scene changes, sampling patterns, and the complexity of 4D data. To address those issues, we leverage the primitive plane as a mid-level representation to capture the long-term spatial-temporal context in 4D point cloud videos and propose a novel hierarchical backbone named Point Primitive Transformer (PPTr), which is mainly composed of intra-primitive point transformers and primitive transformers. Extensive experiments show that PPTr outperforms the previous state of the art on different tasks.
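
To make the two-level idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each point already carries a feature vector and a primitive label, runs attention among points that share a primitive, pools each primitive into one token, and then runs attention across those tokens. The names `self_attention`, `two_level_attention`, and `primitive_ids` are hypothetical, and the attention uses identity projections and mean pooling purely for illustration; the abstract does not specify these details.

```python
import math


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Illustrative only: a real transformer layer would use learned projections.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[i] for w, v in zip(weights, tokens))
                    for i in range(d)])
    return out


def two_level_attention(features, primitive_ids):
    """Attend within each primitive, then across pooled primitive tokens."""
    # Group point indices by their (assumed, precomputed) primitive label.
    groups = {}
    for idx, pid in enumerate(primitive_ids):
        groups.setdefault(pid, []).append(idx)

    refined = [None] * len(features)
    primitive_tokens = {}
    for pid, idxs in groups.items():
        # Level 1: points only attend to points in the same primitive.
        attended = self_attention([features[i] for i in idxs])
        for i, f in zip(idxs, attended):
            refined[i] = f
        # Mean-pool the refined points into one token per primitive (assumption).
        d = len(attended[0])
        primitive_tokens[pid] = [sum(f[j] for f in attended) / len(attended)
                                 for j in range(d)]

    # Level 2: primitive tokens attend to each other.
    pids = sorted(primitive_tokens)
    mixed = self_attention([primitive_tokens[p] for p in pids])
    return refined, dict(zip(pids, mixed))


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0], [1.0, 3.0]]
    points, tokens = two_level_attention(feats, [0, 0, 1, 1, 1])
    print(len(points), sorted(tokens))
```

Restricting the first level to points within one primitive keeps each attention call small, and the second level then operates on far fewer tokens than there are points, which is one plausible reading of why a hierarchy helps with long sequences.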

Hao Wen, Yunze Liu, Jingwei Huang, Bo Duan, Li Yi • 2022

Related benchmarks

Task                        Dataset                  Result            Rank
Action Recognition          MSRAction3D              Accuracy 92.33    176
Action Recognition          MSR Action3D (test)      Accuracy 92.33    94
Action Segmentation         HOI4D official (test)    Accuracy 77.4     26
Semantic segmentation       HOI4D 1.0 (test)         mIoU 41           12
4D Action Segmentation      HOI4D                    Accuracy 77.4     10
4D semantic segmentation    HOI4D                    mIoU 41           10
Online Action Segmentation  HOI4D                    Accuracy 69.7     9
Action Recognition          UTD-MHAD                 Accuracy 89.07    8
