| Dataset Name | SOTA Method | Metric | | Updated | Trend |
|---|---|---|---|---|---|
| VCTK 100 audio clips (unseen) | BigVGAN | MAE 0.0925 | 10 | 4d ago | |
| LibriTTS clean (dev) | BigVGAN | MAE 0.0931 | 10 | 4d ago | |
| LJSpeech | DiffWave | MOS 4.49 | 9 | 4d ago | |
| LibriTTS | | UTMOS 4.058 | 8 | 3d ago | |
| VCTK (unseen speakers) | | MOS 4.37 | 8 | 4d ago | |
| LJSpeech and VCTK | | MOS 4.6 | 6 | 4d ago | |
| Inference Speed Benchmark batch size 16, 1s samples | BigVGAN | xRT (GPU) 98.61 | 5 | 3d ago | |
| MUSDB18 (out-of-distribution) | Vocos | Mixture Score 4.61 | 4 | 4d ago | |