| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| VCTK | | WER 0 | 21 | 4d ago | |
| LibriSpeech (test-clean) | GT (Vocoder) | WER 2.87 | 13 | 4d ago | |
| ELHE HE portion (160 unmodified utterances) | | CER 2.9 | 11 | 4d ago | |
| LibriTTS (test-clean) | | WER 2.04 | 11 | 4d ago | |
| VCTK (test) | | nMOS 4.26 | 9 | 4d ago | |
| Seed-TTS English | CosyVoice-VC | SECS 0.9129 | 7 | 4d ago | |
| Elliot Miller target speaker | | WER 3.22 | 7 | 4d ago | |
| LJSpeech target speaker | | WER 3.22 | 7 | 4d ago | |
| Expresso OOD | | F0 Correlation 0.543 | 6 | 4d ago | |
| TIMIT OOD | | F0 Correlation 0.484 | 6 | 4d ago | |
| Voice Conversion (VC) Benchmark | | WER 3.25 | 6 | 4d ago | |
| LibriTTS unseen-to-unseen (test-clean) | | MOS 4.27 | 6 | 4d ago | |
| LibriTTS to VCTK (unseen-to-seen) (test-clean) | | MOS 4.29 | 6 | 4d ago | |
| VCTK seen-to-seen (test) | | MOS 4.32 | 6 | 4d ago | |
| ZeroSpeech Indonesian 2019 (test) | Chen and Hain | CER 15 | 6 | 4d ago | |
| ZeroSpeech English 2019 (test) | Chen and Hain | CER 18 | 6 | 4d ago | |
| CMU Arctic clb to slt | SpeechT5 | MCD 5.87 | 5 | 4d ago | |
| CMU Arctic bdl to slt | SpeechT5 | MCD 5.93 | 5 | 4d ago | |
| Voice Conversion (VC) Zero-shot | SeedVC | UTMOS 4.04 | 4 | 4d ago | |
| ESD | SR | WER 0.149 | 4 | 4d ago | |