Integrated Spoofing-Robust Automatic Speaker Verification via a Three-Class Formulation and LLR
About
Spoofing-robust automatic speaker verification (SASV) aims to integrate automatic speaker verification (ASV) with a spoofing countermeasure (CM). A popular solution is to fuse the scores of independent ASV and CM systems. To model SASV more directly, some frameworks integrate ASV and CM within a single network. However, these solutions are typically bi-encoder based, offer limited interpretability, and cannot be readily adapted to new evaluation parameters without retraining. To address these limitations, we propose a unified end-to-end framework built on a three-class formulation that enables log-likelihood ratio (LLR) inference from class logits, giving a more interpretable decision pipeline. Experiments show performance comparable to existing methods on ASVspoof 5 and better results on SpoofCeleb. Visualization and analysis further indicate that the three-class reformulation is more interpretable.
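As a rough illustration of how an LLR might be read off three class logits, the sketch below assumes the classes are ordered (target bona fide, non-target bona fide, spoof), that a softmax over the logits approximates class posteriors, and that dividing by the training class priors recovers class likelihoods up to a common scale. The function name `sasv_llr`, the class ordering, and the prior parameters `pi_non` and `pi_spf` are hypothetical and not taken from the paper; the authors' actual inference rule may differ.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def sasv_llr(logits, train_priors, pi_non, pi_spf):
    """Hypothetical LLR from three-class logits.

    logits:       [target, non-target, spoof] (assumed ordering)
    train_priors: class proportions seen in training (assumed known)
    pi_non, pi_spf: evaluation priors of non-target and spoof trials;
                    at least one must be positive.
    """
    log_post = log_softmax(logits)
    # Bayes' rule: log p(x|c) = log P(c|x) - log P_train(c) + const.
    log_lik = [lp - math.log(p) for lp, p in zip(log_post, train_priors)]

    # The "reject" hypothesis is a mixture of non-target and spoof,
    # weighted by the evaluation priors. Changing these weights needs
    # no retraining, only a recomputation of the score.
    total = pi_non + pi_spf
    terms = [
        ll + math.log(w / total)
        for ll, w in ((log_lik[1], pi_non), (log_lik[2], pi_spf))
        if w > 0
    ]
    m = max(terms)
    log_reject = m + math.log(sum(math.exp(t - m) for t in terms))

    return log_lik[0] - log_reject


# Example: same logits scored under two different evaluation priors.
uniform = [1 / 3, 1 / 3, 1 / 3]
llr_mostly_nontarget = sasv_llr([2.0, 1.0, -1.0], uniform, 0.9, 0.1)
llr_mostly_spoof = sasv_llr([2.0, 1.0, -1.0], uniform, 0.1, 0.9)
```

Under these assumptions, the evaluation priors enter only at scoring time, which is one way a single trained network could be re-targeted to a different operating point.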
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Spoofing-aware speaker verification | SpoofCeleb (eval set) | a-DCF | 0.1205 | 17 |
| Spoofing-aware speaker verification | ASVspoof 5 (evaluation) | Min a-DCF | 0.5559 | 12 |
| Spoofing-aware speaker verification | WildSpoof TTS | Min a-DCF | 0.4932 | 4 |
| Spoofing-aware speaker verification | SpoofCeleb | min a-DCF | 0.2624 | 4 |
| Spoofing-aware speaker verification | WildSpoof | Macro min a-DCF | 0.4264 | 4 |
| Spoofing-aware speaker verification | SASV 2022 | Min a-DCF (SASV 2022) | 0.491 | 4 |