
Post-training for Deepfake Speech Detection

About

We introduce a post-training approach that adapts self-supervised learning (SSL) models for deepfake speech detection by bridging the gap between general pre-training and domain-specific fine-tuning. We present AntiDeepfake models, a series of post-trained models developed using a large-scale multilingual speech dataset containing over 56,000 hours of genuine speech and 18,000 hours of speech with various artifacts in over one hundred languages. Experimental results show that the post-trained models already exhibit strong robustness and generalization to unseen deepfake speech. When they are further fine-tuned on the Deepfake-Eval-2024 dataset, these models consistently surpass existing state-of-the-art detectors that do not leverage post-training. Model checkpoints and source code are available online.

Wanying Ge, Xin Wang, Xuechen Liu, Junichi Yamagishi • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Audio Deepfake Detection | in the wild | EER | 1.23 | 58 |
| Audio Deepfake Detection | ITW | ACC | 98.7 | 15 |
| Speech Deepfake Detection | FakeOrReal | EER | 173 | 9 |
| Speech Deepfake Detection | ODSS | EER (%) | 1.13 | 7 |
| Speech Deepfake Detection | EF | EER | 20 | 7 |
| Speech Deepfake Detection | ADD ASVspoof 2022 | EER | 1.05 | 7 |
| Speech Deepfake Detection | ADD ASVspoof 2023 | EER | 4.67 | 7 |
| Speech Deepfake Detection | DV | EER | 2.27 | 7 |
| Speech Deepfake Detection | FSW | EER (%) | 16.15 | 7 |
| Speech Deepfake Detection | FoR | Accuracy | 98.05 | 6 |
Showing 10 of 16 rows
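
Most rows above report the equal error rate (EER), the operating point at which the false acceptance rate (spoofed speech accepted as genuine) equals the false rejection rate (genuine speech rejected). The following is a minimal sketch of that common definition, assuming higher scores mean "more likely genuine". The function name `compute_eer` and the toy scores are illustrative only and are not taken from the paper's code or from this leaderboard's scoring scripts, which may interpolate between thresholds differently.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold) for a detector where higher score = more genuine.

    Illustrative sketch: sweeps every observed score as a candidate threshold
    and picks the one where false acceptance and false rejection are closest.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap, eer, threshold)
    for t in sorted(set(bonafide_scores) | set(spoof_scores)):
        # False rejection rate: genuine utterances scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed utterances scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of the two rates at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (hypothetical, not from any dataset in the table).
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.1, 0.2, 0.3, 0.6]
eer, threshold = compute_eer(bonafide, spoof)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

The sketch returns the EER as a fraction; multiply by 100 to express it as a percentage. Lower is better, and a perfectly separating detector reaches an EER of 0.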
