| LibriTTS clean (test) | | PESQ 4.644 | | 63 | 17d ago |
| LibriSpeech (test-clean) | StableCodec | UTMOS 4.23 | | 59 | 1mo ago |
| AISHELL-2 Chinese | MOSS-Audio-Tokenizer | SIM 0.93 | | 54 | 29d ago |
| LibriSpeech English (test-clean) | MOSS-Audio-Tokenizer | SIM 0.97 | | 54 | 29d ago |
| LibriTTS (test-other) | Sylber | UTMOS 3.91 | | 44 | 1mo ago |
| SEED-ZH | MingTok-Audio | PESQ 4.21 | | 21 | 22d ago |
| Chinese speech | SAC | UTMOS 2.99 | | 19 | 1mo ago |
| English speech | WavTokenizer | UTMOS 3.92 | | 19 | 1mo ago |
| LibriSpeech clean (test) | | WER 1.9 | | 19 | 3d ago |
| SeedTTS en (test) | | WER 0.0214 | | 18 | 1mo ago |
| Salmon Sentiment Consistency emotional 2025b (OOD) | | WER 2.9 | | 18 | 1mo ago |
| SEED-EN | H-Codec-2.0 (Large) | PESQ 2.77 | | 12 | 1mo ago |
| Open Track 2 (test) | Baseline | ScoreQ-ref 1.15 | | 12 | 1mo ago |
| Open Track 1 (test) | Baseline | ScoreQ-ref 1.36 | | 12 | 1mo ago |
| Seed-TTS English | DashengTokenizer | PESQ 4.125 | | 9 | 22d ago |
| MLS (Multilingual LibriSpeech) Non-English (test) | Mimi-32 | WER 7.3 | | 9 | 1mo ago |
| English Read by Japanese accented speech 2007 (OOD) | | WER 14.9 | | 9 | 1mo ago |
| Japanese Versatile Speech unseen language speech 2019 (OOD) | | WER 4.6 | | 9 | 1mo ago |
| Gigaspeech noisy speech 2021 (OOD) | | WER 9.7 | | 9 | 1mo ago |
| Librispeech (test) | MSR-Codec-612 | STOI 0.9 | | 8 | 1mo ago |
| LibriTTS (test) | VCNAC | PESQ 4.16 | | 7 | 1mo ago |
| VCTK subset | ReasoningCodec | PESQ (WB) 2.36 | | 7 | 1mo ago |
| GRID (speaker-dependent) | Proposed Method | STOI 0.738 | | 7 | 1mo ago |
| Full-band SR 48kHz | Re. 48kHz HuBERT Base | STOI 85.93 | | 6 | 24d ago |
| Full-band SR 24kHz | Re. 24kHz HuBERT Base | STOI 89.34 | | 6 | 24d ago |