One-Class Knowledge Distillation for Spoofing Speech Detection
About
The detection of spoofing speech generated by unseen algorithms remains an unresolved challenge. One reason for the lack of generalization ability is that traditional detection systems follow the binary classification paradigm, which inherently assumes prior knowledge of spoofing speech. One-class methods instead attempt to learn the distribution of bonafide speech and are inherently suited to tasks where spoofing speech differs significantly from it. However, training a one-class system using only bonafide speech is challenging. In this paper, we introduce a teacher-student framework to provide guidance for the training of a one-class model. The proposed one-class knowledge distillation method outperforms other state-of-the-art methods on the ASVspoof 21DF and InTheWild datasets, which demonstrates its superior generalization ability.
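The abstract does not specify the distillation loss or the scoring rule, so the following is only a toy sketch of the general teacher-student idea, not the paper's model. It assumes a frozen teacher, a student fitted to the teacher on bonafide inputs only, and a score defined as the teacher-student discrepancy. Scalar inputs stand in for speech features, a closed-form least-squares fit stands in for gradient-based distillation, and the names `teacher`, `fit_student`, and `spoof_score` are hypothetical.

```python
# Toy illustration only: scalars stand in for deep speech embeddings.

def teacher(x):
    """Hypothetical frozen teacher: a fixed nonlinear feature extractor."""
    return x * x


def fit_student(bonafide):
    """Fit a linear student s(x) = a*x + b to the teacher on bonafide inputs only.

    Closed-form least squares is used here as a stand-in for
    gradient-based distillation (an assumption of this sketch).
    """
    n = len(bonafide)
    targets = [teacher(x) for x in bonafide]
    mean_x = sum(bonafide) / n
    mean_t = sum(targets) / n
    cov = sum((x - mean_x) * (t - mean_t) for x, t in zip(bonafide, targets))
    var = sum((x - mean_x) ** 2 for x in bonafide)
    a = cov / var
    b = mean_t - a * mean_x
    return a, b


def spoof_score(x, student):
    """Teacher-student discrepancy: larger means less bonafide-like."""
    a, b = student
    return abs(teacher(x) - (a * x + b))


# The student only ever sees inputs from the (toy) bonafide region around 1.0.
bonafide_train = [0.90, 0.95, 1.00, 1.05, 1.10]
student = fit_student(bonafide_train)

in_distribution = spoof_score(1.02, student)   # near the training region
out_of_distribution = spoof_score(3.0, student)  # far from anything seen
```

Because the student matches the teacher only where it was trained, the discrepancy stays small near the bonafide region and grows for inputs unlike anything seen during distillation. In this sketch, that is the property that lets a one-class model flag spoofing from unseen algorithms.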
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Audio Deepfake Detection | in the wild | EER | 7.68 | 58 |
| Audio Deepfake Detection | ASVspoof DF 2021 | EER | 2.27 | 35 |
| Audio Deepfake Detection | ASVspoof LA 2021 | EER | 0.9 | 23 |
| Audio Deepfake Detection | ASVspoof LA and DF 2021 | EER (DF) | 2.27 | 17 |
| Deepfake Audio Detection | ASVspoof LA 2019 | EER (%) | 39 | 12 |
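The benchmarks above are reported as equal error rate (EER), the operating point at which the false acceptance rate (spoof accepted as bonafide) equals the false rejection rate (bonafide rejected). The following is a minimal sketch of one common way to approximate it by sweeping thresholds. It assumes that a higher score means more bonafide-like, and the function name `equal_error_rate` is hypothetical. The evaluation scripts behind the table may interpolate differently.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Assumption: higher score = more bonafide-like, and a trial is
    accepted as bonafide when score >= threshold.
    Returns a fraction in [0, 1]; multiply by 100 for a percentage.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(thresholds[-1] + 1.0)  # a threshold that rejects everything

    best_gap, eer = float("inf"), 1.0
    for th in thresholds:
        # False acceptance rate: spoof trials scored at or above the threshold.
        far = sum(s >= th for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: bonafide trials scored below the threshold.
        frr = sum(s < th for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# One bonafide trial (0.4) scores below one spoof trial (0.6).
eer_overlap = equal_error_rate([0.9, 0.8, 0.4, 0.7], [0.1, 0.2, 0.6, 0.3])
# Perfectly separated scores.
eer_separated = equal_error_rate([0.9, 0.8, 0.7], [0.1, 0.2, 0.3])
```

With finite score sets the two error rates rarely cross exactly, so the sketch returns the average of the two at the threshold where they are closest.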