
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra

About

This paper proposes MP-SENet, a novel Speech Enhancement Network that directly denoises Magnitude and Phase spectra in parallel. MP-SENet adopts a codec architecture in which the encoder and decoder are bridged by convolution-augmented transformers. The encoder encodes time-frequency representations from the input noisy magnitude and phase spectra. The decoder consists of a magnitude mask decoder and a phase decoder operating in parallel, which directly recover the clean magnitude spectra and the clean wrapped phase spectra by incorporating a learnable sigmoid activation and a parallel phase estimation architecture, respectively. Multi-level losses defined on magnitude spectra, phase spectra, short-time complex spectra, and time-domain waveforms are used jointly to train the MP-SENet model. Experimental results show that MP-SENet achieves a PESQ of 3.50 on the public VoiceBank+DEMAND dataset and outperforms existing advanced speech enhancement methods.
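
The abstract does not spell out how the two decoder branches produce their outputs, so the following is only a minimal sketch of the idea for a single time-frequency bin, not the authors' implementation. It assumes the magnitude branch emits a logit that a bounded sigmoid turns into a multiplicative mask, and that the phase branch emits two parallel values (here called pseudo-real and pseudo-imaginary) that are combined with a two-argument arctangent to give a wrapped phase. All function names, and the `alpha` and `beta` defaults, are hypothetical and chosen for illustration.

```python
import cmath
import math


def learnable_sigmoid(x, alpha=1.0, beta=2.0):
    """Bounded activation beta * sigmoid(alpha * x).

    In a trained model alpha would be a learned parameter; here it is a
    plain argument. The defaults are illustrative assumptions.
    """
    return beta / (1.0 + math.exp(-alpha * x))


def wrapped_phase(pseudo_real, pseudo_imag):
    """Combine two parallel branch outputs into a phase in [-pi, pi]."""
    return math.atan2(pseudo_imag, pseudo_real)


def enhance_bin(noisy_mag, mask_logit, pseudo_real, pseudo_imag):
    """Rebuild one complex STFT bin from a masked magnitude and a predicted phase."""
    # Magnitude branch: scale the noisy magnitude by a bounded mask.
    clean_mag = noisy_mag * learnable_sigmoid(mask_logit)
    # Phase branch: computed independently of the magnitude branch.
    clean_phase = wrapped_phase(pseudo_real, pseudo_imag)
    # Polar-to-rectangular conversion gives the enhanced complex spectrum value.
    return cmath.rect(clean_mag, clean_phase)


if __name__ == "__main__":
    print(enhance_bin(noisy_mag=2.0, mask_logit=0.0, pseudo_real=0.0, pseudo_imag=1.0))
```

Using `atan2` on two outputs keeps the predicted phase inside its principal range by construction, so no separate clipping or unwrapping step is needed in this sketch. An upper bound above 1 on the mask lets the magnitude branch amplify a bin as well as attenuate it.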

Ye-Xin Lu, Yang Ai, Zhen-Hua Ling • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Speech Enhancement | VoiceBank + DEMAND (VB-DMD) (test) | PESQ 3.5 | 105 |
| Speech Enhancement | WSJ0 UNI | PESQ 2.71 | 15 |
| Speech Enhancement | VCTK+DEMAND (test) | WB-PESQ 3.5 | 13 |
| Speech Denoising | VBDMD (test) | PESQ 3.5 | 12 |
| Speech Enhancement | DNS non-blind 2020 (test) | SI-SNR 21.03 | 12 |
| Speech Super-resolution | VBDMD-SR (test) | PESQ 3.79 | 10 |
| Speech Restoration | CCF-AATC Challenge 2025 (test) | SIG 3.27 | 7 |
| Speech Denoising | VoiceBank+DEMAND (test) | PESQ 3.496 | 7 |
| General Speech Restoration | URGENT 2025 (val) | SCOREQ 1.59 | 7 |
| General Speech Restoration | VCTK-GSR (test) | SCOREQ 1.87 | 7 |
Showing 10 of 23 rows
