Natural Language-Assisted Sign Language Recognition

About

Sign languages are visual languages which convey information by signers' handshape, facial expression, body movement, and so forth. Due to the inherent restriction of combinations of these visual ingredients, there exist a significant number of visually indistinguishable signs (VISigns) in sign languages, which limits the recognition capacity of vision neural networks. To mitigate the problem, we propose the Natural Language-Assisted Sign Language Recognition (NLA-SLR) framework, which exploits semantic information contained in glosses (sign labels). First, for VISigns with similar semantic meanings, we propose language-aware label smoothing, which eases training by generating soft labels for each training sign; the smoothing weights are computed from the normalized semantic similarities among the glosses. Second, for VISigns with distinct semantic meanings, we present an inter-modality mixup technique which blends vision and gloss features to further maximize the separability of different signs under the supervision of blended labels. We also introduce a novel backbone, the video-keypoint network, which not only models both RGB videos and human body keypoints but also derives knowledge from sign videos of different temporal receptive fields. Empirically, our method achieves state-of-the-art performance on three widely adopted benchmarks: MSASL, WLASL, and NMFs-CSL. Code is available at https://github.com/FangyunWei/SLRT.
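The two label-construction ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy gloss vectors, the softmax-with-temperature normalization, and the values of `eps`, `tau`, and `lam` are all assumptions chosen for illustration. The abstract states only that the smoothing weights come from normalized gloss similarities and that vision and gloss features are blended under blended labels.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def language_aware_soft_label(target, gloss_vecs, eps=0.2, tau=0.5):
    """Soft label for class `target` (hypothetical helper).

    The target class keeps 1 - eps. The remaining eps is spread over the
    other classes in proportion to a softmax (temperature tau) of their
    gloss similarity to the target gloss. The softmax is an assumption.
    """
    weights = {
        i: math.exp(cosine(gloss_vecs[target], vec) / tau)
        for i, vec in enumerate(gloss_vecs)
        if i != target
    }
    total = sum(weights.values())
    label = [0.0] * len(gloss_vecs)
    label[target] = 1.0 - eps
    for i, w in weights.items():
        label[i] = eps * w / total
    return label


def inter_modality_mixup(vision_feat, vision_cls, gloss_feat, gloss_cls,
                         num_classes, lam=0.7):
    """Blend a vision feature with a gloss feature and blend their labels.

    lam is fixed here for reproducibility; sampling it at random per
    example would be a natural alternative (also an assumption).
    """
    mixed_feat = [lam * a + (1.0 - lam) * b
                  for a, b in zip(vision_feat, gloss_feat)]
    mixed_label = [0.0] * num_classes
    mixed_label[vision_cls] += lam
    mixed_label[gloss_cls] += 1.0 - lam
    return mixed_feat, mixed_label


# Toy, made-up gloss embeddings: classes 0 and 1 are semantically close.
gloss_vecs = [
    [1.0, 0.0],  # class 0
    [0.9, 0.1],  # class 1
    [0.0, 1.0],  # class 2
]
soft = language_aware_soft_label(0, gloss_vecs)
mixed_feat, mixed_label = inter_modality_mixup(
    [1.0, 2.0], 0, [3.0, 4.0], 2, num_classes=3
)
```

In this toy setup the semantically closer class 1 receives more of the smoothing mass than class 2, and the mixed label puts weight `lam` on the video's class and `1 - lam` on the gloss's class.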

Ronglai Zuo, Fangyun Wei, Brian Mak • 2023

Related benchmarks

Task                               | Dataset                 | Result                               | Rank
Isolated Sign Language Recognition | WLASL 100               | Per-instance Top-1 Acc: 91.47        | 46
Isolated Sign Language Recognition | WLASL 300               | --                                   | 28
Isolated Sign Language Recognition | MSASL 1000              | Per-class Top-1 Acc: 69.86           | 25
Isolated Sign Language Recognition | MSASL 100               | Per-class Top-1 Acc: 91.04           | 24
Isolated Sign Language Recognition | MSASL200                | Top-1 Acc (Class): 89.23             | 23
Isolated Sign Language Recognition | WLASL 2000              | P-I: 61.05                           | 17
Sign Language Recognition          | WLASL (test)            | Top-1 Accuracy: 61.3                 | 17
Sign Language Recognition          | WLASL2000 v1.0 (test)   | Per-instance Top-1 Acc: 0.6126       | 12
Sign Language Recognition          | WLASL 100 v1.0 (test)   | --                                   | 10
Sign Language Recognition          | WLASL300 v1.0 (test)    | Top-1 Accuracy (Per-instance): 86.98 | 9
Showing 10 of 17 rows
