Physics-Guided Deepfake Detection for Voice Authentication Systems

About

Voice authentication systems deployed at the network edge face dual threats: (a) sophisticated deepfake synthesis attacks and (b) control-plane poisoning in distributed federated learning protocols. We present a framework coupling physics-guided deepfake detection with uncertainty-aware edge learning. The framework fuses interpretable physics features that model vocal tract dynamics with representations from a self-supervised learning module. These representations are then processed by a Multi-Modal Ensemble Architecture, followed by a Bayesian ensemble that provides uncertainty estimates. Combining physics-based evaluation of audio characteristics with per-sample uncertainty estimates allows the proposed framework to remain robust to both advanced deepfake attacks and sophisticated control-plane poisoning, addressing the complete threat model for networked voice authentication.

Alireza Mohammadi, Keshav Sood, Dhananjay Thiruvady, Asef Nazari • 2025
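
The abstract mentions a Bayesian ensemble that provides uncertainty estimates but does not say how they are computed. As a rough illustration only, the sketch below shows one common way to summarise an ensemble's disagreement: the entropy of the averaged prediction (total uncertainty) minus the average entropy of the members (leaving the part attributable to model disagreement). The function names and the choice of this decomposition are assumptions for illustration, not the authors' method.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in nats of a Bernoulli(p) distribution."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def ensemble_uncertainty(member_probs):
    """Summarise spoof probabilities predicted by several ensemble members.

    Hypothetical helper (not from the paper). Returns a tuple of:
      mean_p    - averaged spoof probability
      total     - entropy of the averaged prediction
      epistemic - total minus mean member entropy (member disagreement)
    """
    if not member_probs:
        raise ValueError("need at least one ensemble member")
    mean_p = sum(member_probs) / len(member_probs)
    total = binary_entropy(mean_p)
    mean_member_entropy = sum(binary_entropy(p) for p in member_probs) / len(member_probs)
    # Clamp at zero to absorb tiny negative values from floating-point error.
    epistemic = max(0.0, total - mean_member_entropy)
    return mean_p, total, epistemic


if __name__ == "__main__":
    # Members agree the sample is ambiguous: high total, no disagreement.
    print(ensemble_uncertainty([0.5, 0.5, 0.5]))
    # Members confidently disagree: high total, high disagreement.
    print(ensemble_uncertainty([0.02, 0.98, 0.03, 0.97]))
```

Under this sketch, a sample with a high disagreement term could be flagged for rejection or manual review rather than accepted on the averaged score alone.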

Related benchmarks

Task	Dataset	Result
Deepfake Audio Detection	ASVspoof LA 2019	EER (%): 6.8	12
Deepfake Audio Detection	ASVspoof LA 2021	EER: 0.0905	1
Deepfake Audio Detection	ASVspoof PA 2019	EER (%): 12.95	1
Deepfake Audio Detection	ASVspoof PA 2021	EER: 15.05	1
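
The results above are reported as equal error rate (EER): the operating point at which the rate of spoofed samples accepted equals the rate of bona fide samples rejected. The following is a minimal sketch of how EER can be estimated from two lists of detector scores, assuming that higher scores mean "more likely bona fide". The function name and the simple threshold sweep are illustrative assumptions; official ASVspoof scoring tools may interpolate differently.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate EER by sweeping every observed score as a threshold.

    Assumes higher score = more likely bona fide. A sample is accepted
    when its score is >= the threshold. Returns a fraction in [0, 1].
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    # Extra threshold above every score, so "reject everything" is considered.
    thresholds.append(thresholds[-1] + 1.0)
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False acceptance rate: spoofed samples that pass the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: bona fide samples that fall below it.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


if __name__ == "__main__":
    # Toy scores, not data from the paper.
    print(equal_error_rate([0.9, 0.8, 0.7], [0.1, 0.2, 0.3]))
    print(equal_error_rate([0.9, 0.4], [0.6, 0.1]))
```

Multiplying the returned fraction by 100 gives the percentage form used in the "EER (%)" rows.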

