AudioDec: An Open-source Streaming High-fidelity Neural Audio Codec

About

A good audio codec for live applications such as telecommunication is characterized by three key properties: (1) compression, i.e., the bitrate that is required to transmit the signal should be as low as possible; (2) latency, i.e., encoding and decoding the signal needs to be fast enough to enable communication without or with only minimal noticeable delay; and (3) reconstruction quality of the signal. In this work, we propose an open-source, streamable, and real-time neural audio codec that achieves strong performance along all three axes: it can reconstruct highly natural-sounding 48 kHz speech signals while operating at only 12 kbps and running with less than 6 ms (GPU) / 10 ms (CPU) latency. An efficient training paradigm is also demonstrated for developing such neural audio codecs for real-world scenarios. Both objective and subjective evaluations using the VCTK corpus are provided. To sum up, AudioDec is a well-developed plug-and-play benchmark for audio codec applications.
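
To put the 12 kbps figure in perspective, the sketch below compares it with the bitrate of uncompressed PCM audio at the same 48 kHz sample rate. The 16-bit sample depth and single channel are assumptions made for illustration only; the abstract does not state the format of the uncompressed signal, and the function names are hypothetical.

```python
# Rough compression-ratio arithmetic for the figures quoted in the abstract.
# ASSUMPTION: the uncompressed reference is 16-bit mono PCM (not stated in the source).

SAMPLE_RATE_HZ = 48_000      # from the abstract
CODEC_BITRATE_BPS = 12_000   # 12 kbps, from the abstract
BITS_PER_SAMPLE = 16         # assumption
CHANNELS = 1                 # assumption


def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(raw_bps: int, coded_bps: int) -> float:
    """How many times smaller the coded stream is than the raw stream."""
    return raw_bps / coded_bps


raw_bps = pcm_bitrate_bps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
ratio = compression_ratio(raw_bps, CODEC_BITRATE_BPS)
print(f"raw PCM: {raw_bps / 1000:.0f} kbps, codec: {CODEC_BITRATE_BPS / 1000:.0f} kbps, ratio: {ratio:.0f}x")
# prints "raw PCM: 768 kbps, codec: 12 kbps, ratio: 64x"
```

Under these assumptions the codec transmits roughly 1/64 of the raw bitrate; a different sample depth or channel count would change the ratio proportionally.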

Yi-Chiao Wu, Israel D. Gebru, Dejan Marković, Alexander Richard • 2023

Related benchmarks

Task | Dataset | Result | Rank
Audio Coding | LibriTTS Out of Domain, 24 kHz, 6 kbps (test) | SI-SDR: -19.57 | 4
Audio Coding | LibriTTS In Domain, 24 kHz, 6 kbps (test) | SI-SDR: -14.48 | 4
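
The results above are reported as SI-SDR (scale-invariant signal-to-distortion ratio). As a minimal sketch of how such a score can be computed, the snippet below follows the commonly used definition: project the estimate onto the reference, then compare the energy of that projection with the energy of the residual, in dB. This is an assumption about the metric's general form, not the evaluation code behind the numbers listed here; the function name and the `eps` parameter are hypothetical, and mean removal, which some implementations apply first, is omitted.

```python
import math


def si_sdr(reference, estimate, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length sequences of samples.

    Sketch of the common definition; `eps` only guards against division by zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")

    # Optimal scaling of the reference that best explains the estimate.
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    alpha = dot / (ref_energy + eps)

    # Split the estimate into a scaled-reference part and a residual part.
    target = [alpha * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10(target_energy / (residual_energy + eps) + eps)


# Toy signals: the distortion is orthogonal to the reference.
ref = [1.0, 1.0, 1.0, 1.0]
distortion = [1.0, -1.0, 1.0, -1.0]

equal_energy = [r + d for r, d in zip(ref, distortion)]
heavy_distortion = [r + 10.0 * d for r, d in zip(ref, distortion)]

print(round(si_sdr(ref, equal_energy), 3))      # about 0 dB
print(round(si_sdr(ref, heavy_distortion), 3))  # about -20 dB
```

A negative score means the residual carries more energy than the part of the estimate that is aligned with the reference. Because the metric is computed sample by sample on the waveform, it is sensitive to any sample-level mismatch between the two signals.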
