LA-Sign: Looped Transformers with Geometry-aware Alignment for Skeleton-based Sign Language Recognition

About

Skeleton-based isolated sign language recognition (ISLR) demands fine-grained understanding of articulated motion across multiple spatial scales, from subtle finger movements to global body dynamics. Existing approaches typically rely on deep feed-forward architectures, which increase model capacity but lack mechanisms for recurrent refinement and structured representation. We propose LA-Sign, a looped transformer framework with geometry-aware alignment for ISLR. Instead of stacking deeper layers, LA-Sign derives its depth from recurrence, repeatedly revisiting latent representations to progressively refine motion understanding under shared parameters. To further regularise this refinement process, we present a geometry-aware contrastive objective that projects skeletal and textual features into an adaptive hyperbolic space, encouraging multi-scale semantic organisation. We study three looping designs and multiple geometric manifolds, demonstrating that encoder-decoder looping combined with adaptive Poincaré alignment yields the strongest performance. Extensive experiments on WLASL and MSASL benchmarks show that LA-Sign achieves state-of-the-art results while using fewer unique layers, highlighting the effectiveness of recurrent latent refinement and geometry-aware representation learning for sign language recognition.
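
The two ideas in the abstract, depth from recurrence under shared parameters and alignment in a Poincaré ball, can be illustrated with a toy sketch. This is not the authors' implementation: the scalar `weight`, the function names, and the fixed curvature parameter `c` are assumptions made for illustration, and the "adaptive" part of the hyperbolic space (which might, for instance, involve a learnable curvature) is not modelled here.

```python
import math

def looped_refine(x, weight, n_loops):
    # Toy looped block (hypothetical): the same `weight` is reused on every
    # pass, so effective depth grows with n_loops while parameters stay fixed.
    for _ in range(n_loops):
        x = [xi + math.tanh(weight * xi) for xi in x]
    return x

def exp_map_origin(v, c=1.0):
    # Exponential map at the origin: sends a Euclidean vector into the
    # Poincare ball of curvature -c (all points have norm < 1/sqrt(c)).
    norm = math.sqrt(sum(vi * vi for vi in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * vi for vi in v]

def poincare_distance(x, y, c=1.0):
    # Geodesic distance between two points inside the Poincare ball.
    sq_diff = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    x_sq = sum(xi * xi for xi in x)
    y_sq = sum(yi * yi for yi in y)
    arg = 1.0 + 2.0 * c * sq_diff / ((1.0 - c * x_sq) * (1.0 - c * y_sq))
    return math.acosh(arg) / math.sqrt(c)

# Hypothetical 2-D features standing in for skeletal and textual embeddings.
skeleton_feat = exp_map_origin(looped_refine([0.3, -0.1], weight=0.5, n_loops=3))
text_feat = exp_map_origin([0.4, 0.0])
print(poincare_distance(skeleton_feat, text_feat))
```

A contrastive objective of the kind the abstract describes would then pull matching skeleton and text pairs together under such a distance and push mismatched pairs apart. The loss itself is left out of this sketch because the source gives no details of it.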

Muxin Pu, Mei Kuan Lim, Chun Yong Chong, Chen Change Loy • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Sign Language Recognition | WLASL2000 | P-I Accuracy | 64.73 | 15 |
| Sign Language Recognition | MSASL 1000 (test) | Per-class Acc | 79.64 | 11 |
| Sign Language Recognition | WLASL 300 | P-I | 88.66 | 11 |
| Sign Language Recognition | MSASL 200 | Per-class Accuracy (P-C) | 94.1 | 11 |
