Benchmarking Audio Deepfake Detection Robustness in Real-world Communication Scenarios

About

Existing Audio Deepfake Detection (ADD) systems often struggle to generalise because audio codec compression and channel transmission effects significantly degrade audio quality in real-world communication scenarios. To address this challenge, we developed a rigorous benchmark to evaluate the performance of ADD systems under such scenarios. We introduced ADD-C, a new test dataset for evaluating the robustness of ADD systems under diverse communication conditions, including different combinations of audio codecs for compression and packet loss rates. Benchmarking three baseline ADD models on the ADD-C dataset demonstrated a significant decline in robustness under these conditions. We also proposed a novel Data Augmentation (DA) strategy to improve the robustness of ADD systems. Experimental results demonstrated that the proposed approach significantly enhances the performance of ADD systems on the ADD-C dataset. Our benchmark can assist future efforts towards building practical and robustly generalisable ADD systems.
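
As a rough illustration of what a packet loss condition can look like, the sketch below zeroes out randomly chosen fixed-length frames of a waveform. It is not the ADD-C generation pipeline: the function name `simulate_packet_loss`, the 16 kHz sample rate, the 20 ms frame length, the independent per-frame loss model, and the zero-filling of lost frames are all assumptions made for illustration, and no codec compression is applied.

```python
import numpy as np


def simulate_packet_loss(waveform, sample_rate=16000, frame_ms=20,
                         loss_rate=0.1, seed=0):
    """Zero out randomly selected frames of a 1-D waveform.

    Hypothetical helper: each frame is dropped independently with
    probability `loss_rate` (an assumed loss model, not the one used
    to build ADD-C).
    """
    rng = np.random.default_rng(seed)
    frame_len = int(sample_rate * frame_ms / 1000)  # samples per frame
    out = np.array(waveform, dtype=np.float32, copy=True)
    n_frames = int(np.ceil(len(out) / frame_len))

    # Boolean mask: True where the frame is treated as lost.
    lost = rng.random(n_frames) < loss_rate

    for i in np.flatnonzero(lost):
        # Assumption: a lost frame is replaced by silence (zeros).
        out[i * frame_len:(i + 1) * frame_len] = 0.0
    return out, lost


if __name__ == "__main__":
    # One second of a 440 Hz tone as a stand-in for real speech.
    t = np.arange(16000) / 16000.0
    tone = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)
    degraded, lost = simulate_packet_loss(tone, loss_rate=0.2)
    print(f"frames lost: {int(lost.sum())} of {len(lost)}")
```

Sweeping `loss_rate` over several values and pairing each with a different codec would give a grid of conditions in the spirit of the one described above, although the actual codecs and loss rates used in ADD-C are not specified here.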

Haohan Shi, Xiyu Shi, Safak Dogan, Saif Alzubi, Tianjin Huang, Yunxiao Zhang • 2025

Related benchmarks

Task	Dataset	Result
Audio Deepfake Detection	ADD-C 1.0s duration (test)	C0 Score: 9.39	12
Audio Deepfake Detection	ADD-C 2.0s duration (test)	Class 0 Score: 4.98	12
Audio Deepfake Detection	ADD-C 0.5s duration (test)	C0 Score: 13.44	12
Audio Deepfake Detection	ADD-C 1.5s duration (test)	C0 Score: 6.03	12
