
A Recurrent Variational Autoencoder for Speech Enhancement

About

This paper presents a generative approach to speech enhancement based on a recurrent variational autoencoder (RVAE). The deep generative speech model is trained using clean speech signals only, and it is combined with a nonnegative matrix factorization noise model for speech enhancement. We propose a variational expectation-maximization algorithm where the encoder of the RVAE is fine-tuned at test time, to approximate the distribution of the latent variables given the noisy speech observations. Compared with previous approaches based on feed-forward fully-connected architectures, the proposed recurrent deep generative speech model induces a posterior temporal dynamic over the latent variables, which is shown to improve the speech enhancement results.
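
As a rough illustration of how a speech variance model and a nonnegative matrix factorization (NMF) noise model can be combined, the sketch below applies a Wiener-like gain to a noisy short-time Fourier transform. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `wiener_enhance`, `speech_var`, `W`, and `H` are hypothetical, the speech variance is taken as a given array rather than produced by a trained RVAE decoder, and the variational EM steps that would estimate these quantities are omitted.

```python
import numpy as np


def nmf_noise_variance(W, H):
    """Noise variance per time-frequency bin, shape (F, N), as the product W @ H."""
    return W @ H


def wiener_enhance(noisy_stft, speech_var, W, H, eps=1e-10):
    """Apply a Wiener-like gain to a noisy STFT (illustrative only).

    noisy_stft : complex array, shape (F, N)
    speech_var : nonnegative array, shape (F, N); ASSUMED to be supplied by
                 a speech model such as a decoder network (not modeled here)
    W, H       : nonnegative NMF factors, shapes (F, K) and (K, N)
    eps        : small constant that avoids division by zero
    """
    noise_var = nmf_noise_variance(W, H)
    # Gain lies in [0, 1]: close to 1 where speech dominates, close to 0 where noise does.
    gain = speech_var / (speech_var + noise_var + eps)
    return gain * noisy_stft


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    F, N, K = 5, 8, 2  # frequency bins, time frames, NMF rank (arbitrary toy sizes)
    noisy = rng.normal(size=(F, N)) + 1j * rng.normal(size=(F, N))
    speech_var = rng.uniform(0.1, 1.0, size=(F, N))
    W = rng.uniform(0.1, 1.0, size=(F, K))
    H = rng.uniform(0.1, 1.0, size=(K, N))
    enhanced = wiener_enhance(noisy, speech_var, W, H)
    print(enhanced.shape)
```

The gain never exceeds 1, so each enhanced coefficient has a magnitude no larger than the corresponding noisy one; in a full system the variances would be re-estimated iteratively rather than fixed as here.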

Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud • 2019

Related benchmarks

Task                 Dataset                                      Result      Rank
Speech Enhancement   GRID and DEMAND Kitchen noise (test)         SDR -1.21   6
Speech Enhancement   GRID and DEMAND Station noise (test)         SDR -7.28   6
Speech Enhancement   GRID and DEMAND Metro noise (test)           SDR -3.4    6
Speech Enhancement   GRID and DEMAND Cafeteria noise (test)       SDR -7.81   6
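
For readers unfamiliar with the metric, the sketch below computes a basic signal-to-distortion ratio (SDR) in decibels between a clean reference and an estimate. This is an assumption-laden illustration: the page does not state which SDR variant produced the figures above (evaluation toolkits often use more elaborate definitions), and the function name `sdr_db` is hypothetical.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-12):
    """Basic SDR in dB: 10 * log10(reference energy / error energy).

    reference, estimate : real 1-D arrays of equal length (time-domain signals)
    eps                 : small constant that avoids log of zero or division by zero
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))


if __name__ == "__main__":
    ref = np.array([1.0, 0.0, -1.0, 0.0])
    est = 0.9 * ref  # estimate with a 10% amplitude error
    print(round(sdr_db(ref, est), 2))
```

Under this definition a higher value means less distortion, and the value is negative whenever the error energy exceeds the reference energy.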
