SignX: Continuous Sign Recognition in Compact Pose-Rich Latent Space

About

Processing Sign Language (SL) data is complex and poses many challenges. Current approaches to sign recognition translate RGB sign language videos, via pose information, into word-based ID glosses, which uniquely identify signs. This paper proposes SignX, a novel framework for continuous sign language recognition (SLR) in a compact pose-rich latent space. First, we construct a unified latent representation that encodes heterogeneous pose formats (SMPLer-X, DWPose, Mediapipe, PrimeDepth, and Sapiens Segmentation) into a compact, information-dense space. Second, we train a ViT-based Video-to-Pose module to extract this latent representation directly from raw videos. Finally, we develop a temporal modeling and sequence refinement method that operates entirely in this latent space. This multi-stage design achieves end-to-end SLR while significantly reducing computational cost. Experimental results demonstrate that SignX achieves state-of-the-art accuracy on continuous SLR and translation tasks, delivering nearly a 50-fold acceleration over pixel-space baselines.

Sen Fang, Yalin Feng, Chunyu Sui, Hongbin Zhong, Yanxin Zhang, Hongwei Yi, Hezhen Hu, Dimitris N. Metaxas • 2025

Related benchmarks

Task | Dataset | Result | Rank
Sign Language Translation | PHOENIX-2014T (test) | BLEU-4 29.91 | 183
Sign Language Translation | CSL-Daily (test) | BLEU-4 28.58 | 158
Sign Language Translation | PHOENIX-2014T (dev) | BLEU-4 30.08 | 147
Sign Language Translation | CSL-Daily (dev) | BLEU-4 28.75 | 115
Isolated Sign Language Recognition | WLASL 2000 | P-I 68.29 | 25
Continuous Sign Language Recognition | RWTH 2014-T | WER 18.6 | 8
Continuous Sign Language Recognition | CSL-Daily | WER 24.3 | 7
Sign Language Recognition (Sign2Gloss) | ASLLRP (dev) | ROUGE 56.65 | 4
Sign Language Recognition (Sign2Gloss) | ASLLRP (test) | ROUGE 56.48 | 4
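
The continuous SLR rows report word error rate (WER). This page does not say how those scores were computed, so the following is only a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a predicted and a reference gloss sequence, divided by the reference length. The function name `word_error_rate` and the gloss strings in the usage note are made up for illustration and are not taken from the paper or its datasets.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch of the standard WER definition; not the
    evaluation code used for the results listed on this page.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one gloss")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # turning i reference tokens into nothing costs i deletions
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("MORGEN REGEN IM NORDEN", "MORGEN SONNE NORDEN")` counts one substitution and one deletion against four reference glosses and returns 0.5. Benchmark tables usually list WER as a percentage, so that value would appear as 50.0 there.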