Resource-Efficient RGB-Only Action Recognition for Edge Deployment

About

Action recognition on edge devices poses stringent constraints on latency, memory, storage, and power consumption. While auxiliary modalities such as skeleton and depth information can enhance recognition performance, they often require additional sensors or computationally expensive pose-estimation pipelines, limiting practicality for edge use. In this work, we propose a compact RGB-only network tailored for efficient on-device inference. Our approach builds upon an X3D-style backbone augmented with Temporal Shift, and further introduces selective temporal adaptation and parameter-free attention. Extensive experiments on the NTU RGB+D 60 and 120 benchmarks demonstrate a strong accuracy-efficiency balance. Moreover, deployment-level profiling on the Jetson Orin Nano verifies a smaller on-device footprint and practical resource utilization compared to existing RGB-based action recognition techniques.
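As a rough illustration of the Temporal Shift idea mentioned above, the sketch below moves a fraction of the channels one step along the time axis so that each frame mixes in features from its neighbours without extra parameters. It follows the commonly described temporal shift operation, not the authors' implementation: the function name `temporal_shift`, the `fold_div` parameter, and the list-of-lists layout are assumptions made for this example.

```python
def temporal_shift(frames, fold_div=8):
    """Generic temporal shift sketch (hypothetical helper, not the paper's code).

    frames: list of T frames, each a list of C channel values.
    The first C // fold_div channels take their value from the next frame,
    the following C // fold_div channels take it from the previous frame,
    and the remaining channels are left unchanged. Positions with no
    neighbour (the ends of the clip) are zero-filled.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div

    out = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1      # pull from the following frame
            elif c < 2 * fold:
                src = t - 1      # pull from the preceding frame
            else:
                src = t          # unshifted channel
            if 0 <= src < num_frames:
                out[t][c] = frames[src][c]
    return out


# Toy clip: 3 frames, 4 channels, each frame filled with its 1-based index.
clip = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]
shifted = temporal_shift(clip, fold_div=4)
```

With `fold_div=4` on four channels, one channel is shifted in each direction and two stay in place. The shift itself adds no weights or multiplications, which is why it is often paired with compact backbones.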

Dongsik Yoon, Jongeun Kim, Dayeon Lee • 2026

Related benchmarks

Task                Dataset                                 Result          Rank
Action Recognition  NTU RGB+D 120 (X-set)                   Accuracy 92.67  661
Action Recognition  NTU RGB+D 60 (Cross-View)               Accuracy 98.31  575
Action Recognition  NTU RGB+D 60 (X-sub)                    Accuracy 95.21  467
Action Recognition  NTU RGB+D X-sub 120                     Accuracy 90.88  377
Action Recognition  NTU RGB+D 60 120 (offline evaluation)   Accuracy 98.3   4
