Multilingual Gloss-free Sign Language Translation: Towards Building a Sign Language Foundation Model

About

Sign Language Translation (SLT) aims to convert sign language (SL) videos into spoken language text, thereby bridging the communication gap between the sign and the spoken community. While most existing works focus on translating a single sign language into a single spoken language (one-to-one SLT), leveraging multilingual resources could mitigate low-resource issues and enhance accessibility. However, multilingual SLT (MLSLT) remains unexplored due to language conflicts and alignment difficulties across SLs and spoken languages. To address these challenges, we propose a multilingual gloss-free model with dual CTC objectives for token-level SL identification and spoken text generation. Our model supports 10 SLs and handles one-to-one, many-to-one, and many-to-many SLT tasks, achieving competitive performance compared to state-of-the-art methods on three widely adopted benchmarks: multilingual SP-10, PHOENIX14T, and CSL-Daily.
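As a rough illustration of what "dual CTC objectives" could look like, the sketch below computes a standard CTC negative log-likelihood in plain Python and sums two such terms, one for a language-identification target sequence and one for the spoken-text target. The abstract does not specify how the two objectives are combined, so the weighted sum, the function names `ctc_neg_log_likelihood` and `dual_ctc_loss`, and the `lang_weight` parameter are all assumptions made for illustration, not the authors' implementation.

```python
import math

NEG_INF = float("-inf")


def _logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b)) that tolerates -inf inputs."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_neg_log_likelihood(log_probs, target, blank=0):
    """Standard CTC loss for a single sequence (forward algorithm).

    log_probs: list of T frames, each a list of log-probabilities per symbol.
    target:    list of label ids without blanks.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # Initialise with the first frame: start in the leading blank or first label.
    alpha = [NEG_INF] * size
    alpha[0] = log_probs[0][blank]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]

    for frame in log_probs[1:]:
        new_alpha = [NEG_INF] * size
        for s in range(size):
            acc = alpha[s]                        # stay on the same symbol
            if s >= 1:
                acc = _logaddexp(acc, alpha[s - 1])  # advance by one
            # Skip over a blank only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                acc = _logaddexp(acc, alpha[s - 2])
            if acc != NEG_INF:
                new_alpha[s] = acc + frame[ext[s]]
        alpha = new_alpha

    # Valid paths end in the last label or the trailing blank.
    total = alpha[size - 1]
    if size > 1:
        total = _logaddexp(total, alpha[size - 2])
    return -total


def dual_ctc_loss(lang_log_probs, lang_target, text_log_probs, text_target,
                  lang_weight=1.0):
    """Hypothetical combination: weighted sum of two CTC terms (an assumption)."""
    lang_loss = ctc_neg_log_likelihood(lang_log_probs, lang_target)
    text_loss = ctc_neg_log_likelihood(text_log_probs, text_target)
    return lang_weight * lang_loss + text_loss


# Toy usage: two frames, vocabulary {0: blank, 1: some token}, uniform scores.
frames = [[math.log(0.5), math.log(0.5)]] * 2
loss = dual_ctc_loss(frames, [1], frames, [1], lang_weight=0.5)
```

In a real system the per-frame log-probabilities would come from two output heads over the video encoder, and a framework implementation of CTC would replace the hand-written forward pass.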

Sihan Tan, Taro Miyazaki, Kazuhiro Nakadai • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Sign Language Translation | PHOENIX-2014T (dev) | -- | 111 |
| Sign Language Translation | CSL-Daily (test) | BLEU-4 14.18 | 99 |
| Sign Language Translation | CSL-Daily (dev) | ROUGE 39.33 | 80 |
| Sign Language Translation | PHOENIX14T (test) | BLEU-4 24.23 | 50 |
| Sign Language Translation | SP-10 1.0 (dev) | BLEU 8.79 | 10 |
| Sign Language Translation | SP-10 1.0 (test) | BLEU 7.32 | 10 |
| Many-to-one Sign Language Translation | SP-10 (dev) | csl BLEU 7.06 | 5 |
| Many-to-one Sign Language Translation | SP-10 (test) | csl BLEU 5.92 | 5 |
