TS-URGENet: A Three-stage Universal Robust and Generalizable Speech Enhancement Network

About

Universal speech enhancement aims to handle input speech with different distortions and input formats. To tackle this challenge, we present TS-URGENet, a Three-Stage Universal, Robust, and Generalizable speech Enhancement Network. To address various distortions, the proposed system employs a novel three-stage architecture consisting of a filling stage, a separation stage, and a restoration stage. The filling stage mitigates packet loss by preliminarily filling lost regions under noise interference, ensuring signal continuity. The separation stage suppresses noise, reverberation, and clipping distortion to improve speech clarity. Finally, the restoration stage compensates for bandwidth limitation, codec artifacts, and residual packet loss distortion, refining the overall speech quality. Our proposed TS-URGENet achieved outstanding performance in the Interspeech 2025 URGENT Challenge, ranking 2nd in Track 1.
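The ordering of the three stages can be illustrated with a minimal sketch. This is not the authors' implementation: the actual stages are neural networks, whereas every function below (`fill_gaps`, `remove_dc`, `restore`, `enhance`) is a hypothetical toy stand-in chosen only to show how a filling stage, a separation stage, and a restoration stage chain together. Lost samples are assumed to be marked as `None`.

```python
from typing import List, Optional, Sequence

Signal = List[float]


def fill_gaps(samples: Sequence[Optional[float]]) -> Signal:
    """Toy filling stage: linearly interpolate over samples marked None (lost)."""
    out = list(samples)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is None:
            start = i
            while i < n and out[i] is None:
                i += 1
            # Neighbouring known samples on each side of the gap, if any.
            left = out[start - 1] if start > 0 else None
            right = out[i] if i < n else None
            if left is None and right is None:
                left = right = 0.0  # whole signal lost: fall back to silence
            elif left is None:
                left = right
            elif right is None:
                right = left
            gap = i - start + 1
            for k in range(start, i):
                t = (k - start + 1) / gap
                out[k] = left + t * (right - left)
        else:
            i += 1
    return out


def remove_dc(signal: Signal) -> Signal:
    """Toy separation stage: subtract the mean as a stand-in for noise suppression."""
    if not signal:
        return []
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


def restore(signal: Signal) -> Signal:
    """Toy restoration stage: pass-through placeholder for bandwidth/codec repair."""
    return list(signal)


def enhance(samples: Sequence[Optional[float]]) -> Signal:
    """Run the stages in the order described: filling, separation, restoration."""
    filled = fill_gaps(samples)
    separated = remove_dc(filled)
    return restore(separated)


if __name__ == "__main__":
    print(enhance([0.0, None, None, 3.0]))
```

Filling runs first so that the later stages receive a continuous signal; restoration runs last so that it can also clean up whatever packet loss distortion remains after the earlier stages.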

Xiaobin Rong, Dahan Wang, Qinwen Hu, Yushi Wang, Yuxiang Hu, Jing Lu • 2025

Related benchmarks

Task	Dataset	Result
Universal Speech Enhancement	URGENT non-blind 2025 (test)	DNSMOS 3	26
Speech Enhancement	URGENT Challenge 2025 (non-blind test)	DNSMOS 3	19
Speech Restoration	URGENT non-blind 2025 (test)	PESQ 2.74	11
Speech Restoration	URGENT blind 2025 (test)	UTMOS 2.16	8
