
Real-time 3D human action recognition based on Hyperpoint sequence

About

Real-time 3D human action recognition has broad industrial applications, such as surveillance, human-computer interaction, and healthcare monitoring. Most existing point cloud sequence networks recognize 3D human actions by capturing spatio-temporal local structures, which requires complex spatio-temporal local encoding. To simplify the point cloud sequence modeling task, we propose a lightweight and effective point cloud sequence network, referred to as SequentialPointNet, for real-time 3D action recognition. Instead of capturing spatio-temporal local structures, SequentialPointNet encodes the temporal evolution of static appearances to recognize human actions. First, we define a novel type of point data, the Hyperpoint, to better describe temporally changing human appearances. A theoretical foundation is provided to clarify the information equivalence property for converting point cloud sequences into Hyperpoint sequences. Second, the point cloud sequence modeling task is decomposed into a Hyperpoint embedding task and a Hyperpoint sequence modeling task. Specifically, for Hyperpoint embedding, static point cloud techniques are employed to convert point cloud sequences into Hyperpoint sequences, which introduces inherent frame-level parallelism; for Hyperpoint sequence modeling, a Hyperpoint-Mixer module is designed as the basic building block to learn the spatio-temporal features of human actions. Extensive experiments on three widely used 3D action recognition datasets demonstrate that the proposed SequentialPointNet achieves competitive classification performance while running up to 10X faster than existing approaches.
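The two-stage decomposition described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: `embed_frame` assumes a PointNet-style shared linear layer with max pooling as a stand-in for the static point cloud network, and `mix_sequence` is a hypothetical placeholder for the Hyperpoint-Mixer, whose actual design is not given in this abstract. All function names, weights, and shapes are assumptions chosen for illustration.

```python
import numpy as np


def embed_frame(points, w, b):
    """Map one static point cloud of shape (N, 3) to one feature vector (C,).

    Assumed stand-in for the static point cloud network: a linear layer
    shared across points, ReLU, then max pooling over the point axis.
    """
    feats = np.maximum(points @ w + b, 0.0)  # (N, C) per-point features
    return feats.max(axis=0)                 # (C,) order-invariant summary


def embed_sequence(frames, w, b):
    """Convert a point cloud sequence (T, N, 3) into a vector sequence (T, C).

    Each frame is embedded independently of the others, which is what
    makes the frames trivially parallelizable.
    """
    return np.stack([embed_frame(frame, w, b) for frame in frames])


def mix_sequence(hyperpoints, w_time, w_chan):
    """Hypothetical sequence model over the (T, C) embedded sequence.

    Mixes information across time, then across channels, and pools over
    time to get one descriptor for the whole action clip.
    """
    mixed_t = np.maximum(w_time @ hyperpoints, 0.0)  # (T, C) temporal mixing
    mixed_c = np.maximum(mixed_t @ w_chan, 0.0)      # (T, C_out) channel mixing
    return mixed_c.max(axis=0)                       # (C_out,)


# Toy usage with random data: 4 frames, 16 points per frame, 8 channels.
rng = np.random.default_rng(0)
frames = rng.normal(size=(4, 16, 3))
w, b = rng.normal(size=(3, 8)), np.zeros(8)

hyperpoints = embed_sequence(frames, w, b)
descriptor = mix_sequence(hyperpoints,
                          rng.normal(size=(4, 4)),
                          rng.normal(size=(8, 5)))
print(hyperpoints.shape)  # (4, 8)
print(descriptor.shape)   # (5,)
```

Because `embed_sequence` never looks at more than one frame at a time, the per-frame calls could be batched or distributed across workers, while all temporal reasoning is deferred to the much smaller (T, C) sequence.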

Xing Li, Qian Huang, Zhijian Wang, Zhenjie Hou, Tianjin Yang, Zhuang Miao • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| Action Recognition | NTU RGB+D 120 (X-set) | Accuracy 95.4 | 770 |
| Action Recognition | NTU RGB+D 60 (Cross-View) | Accuracy 97.6 | 601 |
| Action Recognition | NTU RGB-D Cross-Subject 60 | Accuracy 90.3 | 358 |
| Action Recognition | NTU RGB+D 120 Cross-Subject | Accuracy 83.5 | 241 |
| Action Recognition | MSRAction3D | Accuracy 92.64 | 176 |
| Action Recognition | NTU RGB+D | Accuracy 90.3 | 50 |
| Action Recognition | UTD-MHAD | Accuracy 92.31 | 8 |