
Technical Report of Nomi Team in the Environmental Sound Deepfake Detection Challenge 2026

About

This paper presents our work for the ICASSP 2026 Environmental Sound Deepfake Detection (ESDD) Challenge. The challenge is based on the large-scale EnvSDD dataset that consists of various synthetic environmental sounds. We focus on addressing the complexities of unseen generators and low-resource black-box scenarios by proposing an audio-text cross-attention model. Experiments with individual and combined text-audio models demonstrate competitive EER improvements over the challenge baseline (BEATs+AASIST model).
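The abstract names an audio-text cross-attention model without describing its layers. As a rough illustration of the general mechanism only, the sketch below implements plain scaled dot-product cross-attention in which audio-frame vectors act as queries and text-token vectors act as keys and values. The direction of attention, the absence of learned projections, and the function name `cross_attention` are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (hypothetical helper).

    queries: list of audio-frame vectors (assumed to be the query side)
    keys, values: lists of text-token vectors (assumed key/value side)
    Returns one attended vector per query. No learned projections are
    applied; a real model would normally add them.
    """
    d = len(keys[0])
    attended = []
    for q in queries:
        # Similarity of this audio frame to every text token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the text-token value vectors.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return attended
```

Each output row is a weighted average of the text-side value vectors, so with a single text token every audio frame simply receives that token's value vector.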

Candy Olivia Mawalim, Haotian Zhang, Shogo Okada • 2025

Related benchmarks

Task | Dataset | Result | Rank
Environmental Sound Deepfake Detection | ESDD Track 2 Black-Box Low-Resource 2026 (val) | EER 0.07 | 4
Environmental Sound Deepfake Detection | ESDD Track 2 (Black-Box Low-Resource) 2026 (test) | EER 11.98 | 4
Environmental Sound Deepfake Detection | EnvSDD Track 1 (Unseen Generators) 2026 (val) | EER 0.07 | 4
Environmental Sound Deepfake Detection | EnvSDD Track 1 (Unseen Generators) 2026 (test) | EER (%) 11.22 | 4
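
The results above are reported as equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of one common way to estimate it from detector scores. It assumes that a higher score means "more likely real" and sweeps only the observed scores as thresholds. The function name `compute_eer` and the toy scores are hypothetical, and the challenge's official scoring script may differ.

```python
def compute_eer(bonafide_scores, fake_scores):
    """Estimate the equal error rate (hypothetical helper).

    Assumes higher score = more likely bona fide (real) audio.
    Returns (eer, threshold), with eer as a fraction in [0, 1].
    """
    thresholds = sorted(set(bonafide_scores) | set(fake_scores))
    best = None
    for t in thresholds:
        # False rejection: a real clip scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: a fake clip scored at or above the threshold.
        far = sum(s >= t for s in fake_scores) / len(fake_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores for illustration only; not data from the challenge.
eer, threshold = compute_eer([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.6])
print(f"EER = {eer * 100:.2f}% at threshold {threshold}")
```

Multiplying the returned fraction by 100 gives the percentage form used in an "EER (%)" column.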
