| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| CV3-Eval multilingual voice cloning (test) | Fish Audio S2 | WER (zh) 2.65 | 18 | 2mo ago | |
| SEED-TTS EN (test) | Fish Audio S2 | WER 0.99 | 16 | 2mo ago | |
| LibriHeavy-HQ | LM-Backbone LoRA for TTS | SNR 40.778 | 9 | 2mo ago | |
| HiFi TTS | LM-Backbone LoRA for TTS | SNR 55.854 | 9 | 2mo ago | |
| SEED-TTS-Eval ZH (test) | Index-TTS 2 | CER 1.03 | 8 | 3mo ago | |
| Common Voice English | Chroma 1.0 | SIM Score 0.81 | 7 | 3mo ago | |
| Thai Voice Cloning Short: 1-15s | JaiTTS-v1.0 | CER (%) 1.94 | 5 | 1mo ago | |
| blindset-4 French (test) | Qwen3-TTS | WER 0.05 | 5 | 1mo ago | |
| Thai Voice Cloning evaluation set | JaiTTS-v1.0 | RTF 0.1136 | 4 | 1mo ago | |
| Thai Voice Cloning Long: 16-30s | - | CER 2.47 | 4 | 1mo ago | |
| blindset Chinese 4 (test) | Qwen3-TTS | CER 0.09 | 4 | 1mo ago | |
| blindset-4 Arabic (test) | VoxCPM2 | WER 0.209 | 4 | 1mo ago | |
| voices (unseen) | minimind-3o-moe | Similarity (CAM++) 0.5702 | 3 | 28d ago | |
| voices.pt (seen) | minimind-3o | Similarity (CAM++) 64.72 | 3 | 28d ago | |
| Voice-cloning (overall) | minimind-3o | Similarity (CAM++) 59.95 | 2 | 28d ago | |
| LRS3 (test) | - | MOS 4.15 | 2 | 3mo ago | |
| Voice cloning speakers (test) | - | - | 0 | 3mo ago | |