| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Speaker Recognition | VoxCeleb1 (test) | EER 2.21 | 126 |
| Speaker Verification | VoxCeleb1 (test) | Cosine EER 0.856 | 80 |
| Speaker Identification | VoxCeleb1 | Accuracy 97.5 | 58 |
| Speaker Verification | VoxCeleb1 Hard Cleaned | EER 0.0099 | 45 |
| Speaker Verification | VoxCeleb1 Cleaned (Extended) | EER (%) 0.48 | 45 |
| Speaker Verification | VoxCeleb1 (Vox1-O) | EER 0.627 | 33 |
| Speaker Verification | VoxCeleb1 extended (test) | EER 1.07 | 25 |
| Speaker verification | VoxCeleb 1 (verification) | EER 0.48 | 22 |
| Speaker Verification | VoxCeleb1 (Vox1-H) | EER 0.986 | 20 |
| Face Reenactment | VoxCeleb2 (test) | FID 24.92 | 16 |
| Face Reenactment | VoxCeleb1 (test) | SSIM 0.804 | 16 |
| Speaker Verification | VoxCeleb Hard 1 | EER (f-f) 1.87 | 15 |
| Speaker Verification | VoxCeleb Extended 1 | EER (f-f) 1 | 15 |
| Speaker Verification | VoxCeleb-E | EER (f-f) 0.97 | 15 |
| Cross-modal verification | VoxCeleb1 (Unseen-Unheard) | AUC 85 | 13 |
| Speech Separation | VoxCeleb2-2Mix (test) | SDRi 13.1 | 12 |
| Speaker Recognition | VoxCeleb1 extended (vox1-e) | EER (mean) 0.9 | 11 |
| Speaker Recognition | VoxCeleb1 original (vox1-o) | EER (mean) 0.74 | 11 |
| Imperceptibility | VoxCeleb2 | SSIM 0.961 | 10 |
| Speaker Verification | VoxCeleb 1hr context Normal | EER 0.0094 | 10 |
| Speaker Verification | VoxCeleb 10min context Normal | EER 1.04 | 10 |
| Neural Field Reconstruction | VoxCeleb2 | PSNR (Step 1) 29.84 | 9 |
| Video self-reconstruction | Voxceleb1 (test) | L1 Loss 0.0354 | 9 |
| Cross-identity face animation | Voxceleb 1 | ARD 2.399 | 9 |
| Cross-modal verification | VoxCeleb1 (Seen-Heard) | AUC 0.937 | 9 |