| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| VDM reconstruction | EARS RT60 = 0.6 s (test) | SDR | 19.7 | 13 |
| VDM reconstruction | EARS RT60 = 0.4 s (test) | SDR | 20.37 | 13 |
| VDM reconstruction | EARS RT60 = 0.2 s (test) | SDR | 22.12 | 13 |
| Neural Vocoding | EARS (out-of-domain) | UTMOS | 3.3 | 9 |
| Zero-shot Text-to-Speech | EARS (unseen speakers) | WER | 1.65 | 7 |
| Speech Synthesis | EARS | PESQ | 4.238 | 6 |
| Automatic Speech Recognition | EARS-Reverb | WER (%) | 3.83 | 6 |
| Diffuse sound extraction | EARS RT60 = 0.6 s (test) | SDR | 8.22 | 5 |
| Diffuse sound extraction | EARS RT60 = 0.4 s (test) | SDR | 7.26 | 5 |
| Diffuse sound extraction | EARS RT60 = 0.2 s (test) | SDR | 3.99 | 5 |