RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection

About

Artefacts that help discriminate between bonafide and fake audio can exist in both short- and long-range segments. Therefore, combining local and global feature information can effectively discriminate between bonafide and fake audio. This paper proposes an end-to-end bidirectional state space model, named RawBMamba, to capture both short- and long-range discriminative information for audio deepfake detection. Specifically, we use a sinc layer and multiple convolutional layers to capture short-range features, and then design a bidirectional Mamba to address Mamba's unidirectional modelling problem and further capture long-range feature information. Moreover, we develop a bidirectional fusion module to integrate embeddings, enhancing audio context representation and combining short- and long-range information. The results show that the proposed RawBMamba achieves a 34.1% improvement over Rawformer on the ASVspoof2021 LA dataset and demonstrates competitive performance on other datasets.

Yujie Chen, Jiangyan Yi, Jun Xue, Chenglong Wang, Xiaohui Zhang, Shunbo Dong, Siding Zeng, Jianhua Tao, Lv Zhao, Cunhang Fan • 2024
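
The abstract's point about unidirectional modelling can be illustrated with a toy example. The sketch below is not RawBMamba or Mamba's selective scan: it assumes a scalar linear state space recurrence with fixed, hypothetical parameters `a`, `b`, and `c`, and it fuses the two directions by simple addition, which is only a stand-in for the paper's bidirectional fusion module. It shows that a forward-only scan lets each position see past inputs only, while running a second scan over the reversed sequence gives each position context from both sides.

```python
from typing import List


def ssm_scan(x: List[float], a: float = 0.5, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Causal scan of a toy scalar state space model (hypothetical parameters).

    h[t] = a * h[t-1] + b * x[t]
    y[t] = c * h[t]
    """
    h = 0.0
    y = []
    for x_t in x:
        h = a * h + b * x_t  # the state only ever accumulates past inputs
        y.append(c * h)
    return y


def bidirectional_scan(x: List[float], a: float = 0.5) -> List[float]:
    """Run the scan in both directions and fuse by addition (an assumption)."""
    forward = ssm_scan(x, a=a)
    # Scan the reversed sequence, then flip the outputs back into time order.
    backward = ssm_scan(x[::-1], a=a)[::-1]
    return [f + bk for f, bk in zip(forward, backward)]


impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
print(ssm_scan(impulse))            # [0.0, 0.0, 1.0, 0.5, 0.25]
print(bidirectional_scan(impulse))  # [0.25, 0.5, 2.0, 0.5, 0.25]
```

With the forward scan alone, the positions before the impulse stay at zero. The bidirectional output is symmetric around the impulse, so earlier positions also carry information about what follows.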

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Audio Deepfake Detection | ASVspoof DF 2021 | EER 15.85 | 35 |
| Audio Deepfake Detection | ASVspoof LA 2021 | EER 2.84 | 23 |
| Spoofing Attack Detection | ASVspoof LA 2021 | EER 3.21 | 9 |
| Spoofing Attack Detection | ASVspoof DF 2021 | EER 15.85 | 8 |
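
The results above are reported as EER (equal error rate), the operating point where the false acceptance rate equals the false rejection rate. The following is a minimal sketch of that idea using a plain threshold sweep. It is not the official ASVspoof evaluation script, and the function name `compute_eer`, the score convention (higher means more likely bonafide), and the example scores are all assumptions made for illustration.

```python
from typing import List, Tuple


def compute_eer(bonafide_scores: List[float], spoof_scores: List[float]) -> Tuple[float, float]:
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumes higher scores mean 'more likely bonafide'.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection rate: bonafide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoof trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up scores, for illustration only.
bonafide = [0.9, 0.8, 0.7, 0.3]
spoof = [0.6, 0.4, 0.2, 0.1]
eer, threshold = compute_eer(bonafide, spoof)
print(eer, threshold)  # 0.25 0.6
```

EER is commonly quoted as a percentage, so the 0.25 in this toy example would be written as 25. Lower values indicate better detection.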