Speech DF Arena: A Leaderboard for Speech DeepFake Detection Models
About
Audio deepfake detection has advanced significantly alongside audio deepfake generation. However, a standardized and comprehensive benchmark is still missing. To address this, we introduce Speech DeepFake (DF) Arena, the first comprehensive benchmark for audio deepfake detection. Speech DF Arena provides a toolkit for uniformly evaluating detection systems, currently across 14 diverse datasets and attack scenarios, with standardized evaluation metrics and protocols for reproducibility and transparency. It also includes a leaderboard to compare and rank systems, helping researchers and developers improve the reliability and robustness of their detectors. We include 14 evaluation sets, 12 state-of-the-art open-source detection systems, and 3 proprietary ones. Our study shows that many systems exhibit a high equal error rate (EER) in out-of-domain scenarios, highlighting the need for extensive cross-domain evaluation. The leaderboard is hosted on Hugging Face, and a toolkit for reproducing results across the listed datasets is available on GitHub.
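The results are reported as equal error rate (EER), the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The following is a minimal sketch of how EER can be computed from detector scores. It is not taken from the Speech DF Arena toolkit: the function name `compute_eer` is hypothetical, and it assumes that a higher score means "more likely bona fide".

```python
import numpy as np


def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold) for two sets of detector scores.

    Assumption: higher score = more likely bona fide.
    This is an illustrative sketch, not the toolkit's implementation.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct observed score, in ascending order.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))

    # False rejection rate: bona fide samples scoring below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed samples scoring at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER lies where the two error curves cross; take the closest point
    # and average the two rates there.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return float(eer), float(thresholds[idx])


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    eer, thr = compute_eer([0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.65])
    print(f"EER = {eer * 100:.2f}% at threshold {thr}")
```

The function returns EER as a fraction in [0, 1], so it is multiplied by 100 to express it as a percentage. Evaluating every distinct score as a threshold is simple but quadratic in the worst case; for large evaluation sets a sorted cumulative-count approach would be more efficient.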
Related benchmarks
| Task | Dataset | Result (EER) | Rank |
|---|---|---|---|
| Audio Deepfake Detection | ASVspoof DF 2021 | 3.3 | 47 |
| Audio Deepfake Detection | ASVspoof LA 2021 | 4.23 | 41 |
| Audio Deepfake Detection | FoR | 2.3 | 27 |
| Audio Deepfake Detection | CodecFake | 6.36 | 19 |
| Audio Deepfake Detection | ADD Track 3 2022 | 2.77 | 19 |
| Audio Deepfake Detection | ADD 2023 R1 | 7.47 | 19 |
| Audio Deepfake Detection | ADD 2023 R2 | 12.3 | 19 |
| Audio Deepfake Detection | SONAR | 1.9 | 19 |
| Audio Deepfake Detection | ADD Track 1 2022 | 23.98 | 19 |
| Audio Deepfake Detection | ASVspoof 2024 | 12.39 | 16 |