
SepIt: Approaching a Single Channel Speech Separation Bound

About

We present an upper bound for the Single Channel Speech Separation task, which is based on an assumption regarding the nature of short segments of speech. Using the bound, we are able to show that while recent methods have made significant progress for a few speakers, there is room for improvement for five and ten speakers. We then introduce a deep neural network, SepIt, that iteratively improves the estimates of the different speakers. At test time, SepIt runs a varying number of iterations per test sample, based on a mutual information criterion that arises from our analysis. In an extensive set of experiments, SepIt outperforms the state-of-the-art neural networks for 2, 3, 5, and 10 speakers.

Shahar Lutati, Eliya Nachmani, Lior Wolf • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Speech Separation | WSJ0-2Mix (test) | -- | 141 |
| Speech Separation | WSJ0-3mix (test) | -- | 29 |
| Source Separation | LibriSpeech 2Mix | SI-SDRi 23.1 | 10 |
| Speech Separation | Libri-5Mix | SI-SDRi (dB) 14.5 | 9 |
| Speech Separation | Libri-10Mix | SI-SDRi (dB) 12 | 9 |
| Source Separation | WSJ0 3mix | SI-SDRi 21.2 | 8 |
| Audio Separation | Libri5Mix (test) | SI-SDRi (dB) 13.2 | 6 |
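
The results above are reported as SI-SDRi, the improvement in scale-invariant signal-to-distortion ratio of a separated estimate over the unprocessed mixture. The sketch below follows the commonly used definition of the metric; it is not taken from the SepIt paper or its code. The function names `si_sdr` and `si_sdr_improvement`, the zero-mean normalization, and the small `eps` stabilizer are assumptions made for illustration.

```python
import math


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant SDR in dB (common definition; illustrative helper)."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    n = len(target)
    # Remove the mean from both signals (a common convention, assumed here).
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target to get the scaled target component.
    dot = sum(a * b for a, b in zip(e, t))
    target_energy = sum(b * b for b in t)
    alpha = dot / (target_energy + eps)
    projection = [alpha * b for b in t]
    # Everything not explained by the scaled target counts as distortion.
    residual = [a - p for a, p in zip(e, projection)]
    signal_energy = sum(p * p for p in projection)
    residual_energy = sum(r * r for r in residual)
    return 10.0 * math.log10((signal_energy + eps) / (residual_energy + eps))


def si_sdr_improvement(estimate, target, mixture):
    """SI-SDRi: SI-SDR of the estimate minus SI-SDR of the raw mixture."""
    return si_sdr(estimate, target) - si_sdr(mixture, target)


# Toy signals: the interferer is orthogonal to the target.
target = [1.0, -1.0, 1.0, -1.0]
interferer = [1.0, 1.0, -1.0, -1.0]
mixture = [a + b for a, b in zip(target, interferer)]
# A hypothetical separator output that keeps 10% of the interferer amplitude.
estimate = [a + 0.1 * b for a, b in zip(target, interferer)]
improvement = si_sdr_improvement(estimate, target, mixture)
```

Because the estimate is projected onto the target before the energies are compared, rescaling the estimate leaves the score unchanged, which is why the metric is called scale-invariant. For multi-speaker benchmarks the score is typically averaged over speakers after matching estimates to references; that matching step is omitted here.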