Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition

About

Despite the recent success of deep learning in continuous sign language recognition (CSLR), deep models typically focus on the most discriminative features, ignoring other potentially non-trivial and informative content. This characteristic heavily constrains their capability to learn the implicit visual grammars behind the collaboration of different visual cues (i.e., hand shape, facial expression and body posture). By injecting multi-cue learning into neural network design, we propose a spatial-temporal multi-cue (STMC) network to solve the vision-based sequence learning problem. Our STMC network consists of a spatial multi-cue (SMC) module and a temporal multi-cue (TMC) module. The SMC module is dedicated to spatial representation and explicitly decomposes visual features of different cues with the aid of a self-contained pose estimation branch. The TMC module models temporal correlations along two parallel paths, i.e., intra-cue and inter-cue, which aim to preserve the uniqueness and explore the collaboration of multiple cues. Finally, we design a joint optimization strategy to achieve end-to-end sequence learning of the STMC network. To validate its effectiveness, we perform experiments on three large-scale CSLR benchmarks: PHOENIX-2014, CSL and PHOENIX-2014-T. Experimental results demonstrate that the proposed method achieves new state-of-the-art performance on all three benchmarks.

Hao Zhou, Wengang Zhou, Yun Zhou, Houqiang Li • 2020

Related benchmarks

Task                                  Dataset                  Result                Rank
------------------------------------  -----------------------  --------------------  ----
Continuous Sign Language Recognition  PHOENIX 2014 (dev)       Word Error Rate 19.6  188
Continuous Sign Language Recognition  PHOENIX-2014 (test)      WER 20.7              185
Continuous Sign Language Recognition  PHOENIX14-T (dev)        WER 19.6              75
Continuous Sign Language Recognition  CSL (test)               WER 2.1               23
Continuous Sign Language Recognition  CSL                      WER 2.1               23
Continuous Sign Language Recognition  PHOENIX 14 (dev test)    WER (Dev) 21.1        16
Continuous Sign Language Recognition  PHOENIX14-T (dev test)   WER (Dev) 19.6        14
Continuous Sign Language Recognition  PHOENIX14-T 2018 (dev)   WER 19.6              13
Continuous Sign Language Recognition  PHOENIX14-T 2018 (test)  WER 21                13
Continuous Sign Language Recognition  PHOENIX14 2015 (dev)     Deletion Rate 7.7     13
Showing 10 of 14 rows
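
The table reports Word Error Rate (WER) and a deletion rate without defining them. As a rough guide, WER is commonly computed as (substitutions + deletions + insertions) divided by the number of reference tokens, using a minimum edit-distance alignment between the reference gloss sequence and the predicted one. The sketch below illustrates that standard definition only; it is not the evaluation script used for these benchmarks, the names `ErrorCounts` and `align_counts` are hypothetical, and treating the deletion rate as deletions divided by reference length is an assumption. The gloss tokens in the usage note are made up for illustration.

```python
from typing import List, NamedTuple


class ErrorCounts(NamedTuple):
    """Edit-operation counts from aligning a hypothesis to a reference."""
    substitutions: int
    deletions: int
    insertions: int
    reference_length: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / N, with N the number of reference tokens.
        if self.reference_length == 0:
            raise ValueError("WER is undefined for an empty reference")
        errors = self.substitutions + self.deletions + self.insertions
        return errors / self.reference_length

    @property
    def deletion_rate(self) -> float:
        # Assumed definition: deleted reference tokens / reference length.
        if self.reference_length == 0:
            raise ValueError("deletion rate is undefined for an empty reference")
        return self.deletions / self.reference_length


def align_counts(reference: List[str], hypothesis: List[str]) -> ErrorCounts:
    """Count substitutions, deletions and insertions along one
    minimum-cost Levenshtein alignment (all operations cost 1)."""
    # prev[j] = (cost, subs, dels, ins) for reference[:i] vs hypothesis[:j].
    # With an empty reference, every hypothesis token is an insertion.
    prev = [(j, 0, 0, j) for j in range(len(hypothesis) + 1)]
    for i, ref_tok in enumerate(reference, start=1):
        # With an empty hypothesis, every reference token is a deletion.
        curr = [(i, 0, i, 0)]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            if ref_tok == hyp_tok:
                best = prev[j - 1]  # match: no extra cost
            else:
                c, s, d, n = prev[j - 1]
                best = (c + 1, s + 1, d, n)  # substitution
            c, s, d, n = prev[j]
            candidate = (c + 1, s, d + 1, n)  # deletion of ref_tok
            if candidate[0] < best[0]:
                best = candidate
            c, s, d, n = curr[j - 1]
            candidate = (c + 1, s, d, n + 1)  # insertion of hyp_tok
            if candidate[0] < best[0]:
                best = candidate
            curr.append(best)
        prev = curr
    _, subs, dels, ins = prev[-1]
    return ErrorCounts(subs, dels, ins, len(reference))
```

For example, `align_counts("A B C D".split(), "A X C".split())` aligns one substitution (B to X) and one deletion (D), so `.wer` is 2 / 4 and `.deletion_rate` is 1 / 4. Benchmark figures such as those above are normally expressed as percentages aggregated over a whole evaluation set rather than per sentence.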
