| Dataset Name | SOTA Method | Metric | Trend | ||
|---|---|---|---|---|---|
| RealEdit | CosyEdit2 | WER 4.31 | 15 | 8d ago | |
| Ming-Freeform-Audio-Edit English Substitution | CosyEdit2 | IMOS 4.797 | 6 | 8d ago | |
| Ming-Freeform-Audio-Edit English Deletion | CosyEdit2 | IMOS 4.66 | 6 | 8d ago | |
| Ming-Freeform-Audio-Edit English Insertion | CosyEdit2 | IMOS 4.773 | 6 | 8d ago | |
| EARS-WHAM Long Replacement (noisy) | VoiceCraft | WER 0.1 | 6 | 15d ago | |
| EARS-WHAM Short Replacement (noisy) | VoiceCraft | WER 0.1 | 6 | 15d ago | |
| Ming-Freeform-Audio-Edit Chinese (Deletion) | CosyEdit2 | IMOS 4.72 | 5 | 8d ago | |
| Ming-Freeform-Audio-Edit Chinese Substitution | | DNSMOS (basic) 3.16 | 5 | 8d ago | |
| Ming-Freeform-Audio-Edit Chinese Deletion | | DNSMOS (basic) 3.22 | 5 | 8d ago | |
| Ming-Freeform-Audio-Edit Chinese Insertion | CosyEdit2 | DNSMOS Score (Basic) 3.2 | 5 | 8d ago | |
| EARS-WHAM Insertion (noisy) | VoiceCraft | WER 0.18 | 4 | 15d ago | |
| LibriSpeech-Edit (test) | IndexTTS-2† | WER 2.43 | 4 | 1mo ago | |
| Expressive Speech Editing | Vevo2 | WER 16.83 | 4 | 2mo ago | |