
Universal Speech Enhancement with Regression and Generative Mamba

About

The Interspeech 2025 URGENT Challenge aimed to advance universal, robust, and generalizable speech enhancement by unifying speech enhancement tasks across a wide variety of conditions, including seven different distortion types and five languages. We present Universal Speech Enhancement Mamba (USEMamba), a state-space speech enhancement model designed to handle long-range sequence modeling, time-frequency structured processing, and sampling frequency-independent feature extraction. Our approach primarily relies on regression-based modeling, which performs well across most distortions. However, for packet loss and bandwidth extension, where missing content must be inferred, a generative variant of the proposed USEMamba proves more effective. Despite being trained on only a subset of the full training data, USEMamba achieved 2nd place in Track 1 during the blind test phase, demonstrating strong generalization across diverse conditions.

Rong Chao, Rauf Nasretdinov, Yu-Chiang Frank Wang, Ante Jukić, Szu-Wei Fu, Yu Tsao • 2025

Related benchmarks

Task | Dataset | Result | Rank
Speech Enhancement | URGENT Challenge 2025 (non-blind test) | DNSMOS 3.01 | 19
General Speech Restoration | DNS-Real Out-Domain (test) | SIG 3.239 | 17
Universal Speech Enhancement | URGENT non-blind 2025 (test) | DNSMOS 3.01 | 9
Speech Restoration | CCF-AATC Challenge 2025 (test) | SIG 3.36 | 7
General Speech Restoration | URGENT 2025 (val) | SCOREQ 1.77 | 7
General Speech Restoration | URGENT 2025 (test) | SCOREQ 1.6 | 7
General Speech Restoration | VCTK-GSR (test) | SCOREQ 1.87 | 7