| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Automatic Speech Recognition | LRS3-TED (test) | WER 7.2 | 13 |
| Automatic Speech Recognition | LRS3-TED Noisy (test) | WER 27.7 | 9 |
| Video-to-speech synthesis | LRS3-TED (test) | UTMOS 4.031 | 7 |
| Visual Speech Recognition | LRS3-TED Full (test) | WER 55.1 | 2 |