| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Voice Cloning | CV3-Eval multilingual voice cloning (test) | WER (zh) 2.65 | 18 |
| Speech Generation | CV3-Eval ko | CER 3.9 | 7 |
| Multilingual Voice Cloning | CV3-Eval Multilingual Voice Cloning (hard-en) | WER 5.93 | 6 |
| Zero-shot Text-to-Speech | CV3-Eval Multilingual Voice Cloning | English CER/WER 4.87 | 5 |
| Zero-Shot TTS | CV3-Eval Cross-Lingual Zero-Shot (to-ko) (test) | CER (English Prompt) 6 | 4 |
| Zero-Shot TTS | CV3-Eval Cross-Lingual Zero-Shot (to-ja) (test) | CER (English Prompt) 14.12 | 4 |
| Zero-Shot TTS | CV3-Eval Cross-Lingual Zero-Shot (to-en) (test) | WER (zh) 5.07 | 4 |
| Zero-Shot TTS | CV3-Eval Cross-Lingual Zero-Shot (to-zh) (test) | CER (English Prompt) 7.16 | 4 |