
EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones

About

In this paper, we present the Extreme Bandwidth Extension Network (EBEN), a generative adversarial network (GAN) that enhances audio measured with body-conduction microphones. This type of capture equipment suppresses ambient noise at the expense of speech bandwidth, so signal enhancement techniques are required to recover the wideband speech signal. EBEN leverages a multiband decomposition of the raw captured speech to reduce the time-domain dimensions of the data and to give better control over the full-band signal. This multiband representation is fed to a U-Net-like model, which is trained with a combination of feature and adversarial losses to recover an enhanced audio signal. The proposed discriminator architecture also benefits from this representation. Our approach can achieve state-of-the-art results with a lightweight generator and real-time compatible operation.
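To make the multiband idea concrete, here is a minimal NumPy sketch of how a filter bank can split a waveform into several critically decimated sub-bands, so that each band is shorter in time by the number of bands. The abstract does not specify the decomposition, so the cosine-modulated (pseudo-QMF style) bank, the function name `multiband_analysis`, and the parameters `n_bands`, `taps`, and `beta` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def multiband_analysis(x, n_bands=4, taps=64, beta=9.0):
    """Split a 1-D signal into n_bands decimated sub-bands (illustrative sketch).

    Assumption: a cosine-modulated filter bank built from a Kaiser-windowed
    sinc prototype. The paper's actual filters may differ.
    """
    n = np.arange(taps)
    center = (taps - 1) / 2.0
    # Prototype low-pass filter with cutoff at 1 / (4 * n_bands) cycles per sample.
    fc = 1.0 / (4.0 * n_bands)
    prototype = 2.0 * fc * np.sinc(2.0 * fc * (n - center)) * np.kaiser(taps, beta)

    bands = []
    for k in range(n_bands):
        # Shift the prototype to the centre frequency of band k.
        phase = (2 * k + 1) * np.pi / (2 * n_bands) * (n - center)
        h_k = 2.0 * prototype * np.cos(phase + (-1) ** k * np.pi / 4.0)
        filtered = np.convolve(x, h_k, mode="same")
        # Keep one sample out of n_bands: the time axis shrinks by n_bands.
        bands.append(filtered[::n_bands])
    return np.stack(bands)

# One second of a 500 Hz tone sampled at 16 kHz (hypothetical test signal).
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 500 * t)
subbands = multiband_analysis(tone, n_bands=4)
band_energy = (subbands ** 2).sum(axis=1)
```

With four bands, the 16000-sample input becomes a 4 x 4000 array, and the 500 Hz tone lands almost entirely in the lowest band (0 to 2 kHz under these assumptions). A network operating on this representation sees a shorter time axis with more channels.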

Julien Hauret, Thomas Joubaud, Véronique Zimpfer, Éric Bavu • 2022

Related benchmarks

Task | Dataset | Result | Rank
Speech Enhancement | VCTK Vibration sensor 12-bit, 4-16 kHz upsampling (test) | LSD (Log-Spectral Distance) 1.15 | 18
Speech Enhancement | VCTK Accelerometer 12-bit, 4-16 kHz upsampling (test) | LSD 1.21 | 18
Bandwidth Extension (4-22 kHz upsampling) | MagnaTagATune (test) | LSD 1.17 | 15
Bandwidth Extension (BWE) | VCTK Desktop | LSD 1.13 | 10
Bandwidth Extension (BWE) | VCTK Google Pixel7 | LSD 1.16 | 10
Speech Enhancement | Air- and Bone-Conducted Synchronized Speech corpus (test) | SI-SDR 0.8 | 9
Bandwidth Extension (4-22 kHz upsampling) | VCTK (test) | LSD 1.19 | 7
Bandwidth Extension (8-22 kHz upsampling) | VCTK (test) | LSD 1.06 | 7
Bandwidth Extension (8-22 kHz upsampling) | MagnaTagATune (test) | LSD 1.08 | 6
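
Most of the benchmarks above report LSD (log-spectral distance), where lower is better. The listing does not state the exact formula or STFT settings used, so the following is only a sketch of one common definition: the root-mean-square difference between log power spectra, computed per frame and then averaged over frames. The function name `log_spectral_distance` and the parameters `n_fft`, `hop`, and `eps` are assumptions, and values from this sketch are not guaranteed to match the table.

```python
import numpy as np

def log_spectral_distance(reference, estimate, n_fft=512, hop=256, eps=1e-10):
    """Mean per-frame RMS difference between log10 power spectra (sketch)."""
    window = np.hanning(n_fft)

    def power_spectrogram(x):
        # Frame the signal, apply the window, and take the squared magnitude.
        n_frames = 1 + (len(x) - n_fft) // hop
        frames = np.stack(
            [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
        )
        return np.abs(np.fft.rfft(frames, axis=1)) ** 2

    ref_log = np.log10(power_spectrogram(reference) + eps)
    est_log = np.log10(power_spectrogram(estimate) + eps)
    # RMS over frequency bins, then mean over frames.
    per_frame = np.sqrt(np.mean((ref_log - est_log) ** 2, axis=1))
    return float(np.mean(per_frame))

# Hypothetical signals: a clean reference and a noisy estimate of it.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
noisy = clean + 0.5 * rng.standard_normal(16000)
lsd_same = log_spectral_distance(clean, clean)
lsd_noisy = log_spectral_distance(clean, noisy)
```

Identical signals give a distance of zero, and any spectral mismatch gives a positive value. Both inputs are assumed to have the same length and sampling rate.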
