
VCNAC: A Variable-Channel Neural Audio Codec for Mono, Stereo, and Surround Sound

About

We present VCNAC, a variable-channel neural audio codec. Our approach features a single encoder and decoder parametrization that enables native inference for different channel setups, from mono speech to cinematic 5.1-channel surround audio. Channel compatibility objectives ensure that multi-channel content maintains perceptual quality when decoded to fewer channels. The shared representation enables training of generative language models on a single set of codebooks while supporting inference-time scalability across modalities and channel configurations. Evaluation using objective spatial audio metrics and subjective listening tests demonstrates that our unified approach maintains high reconstruction quality across mono, stereo, and surround audio configurations.

Florian Grötschla, Arunasish Sen, Alessandro Lombardi, Guillermo Cámbara, Andreas Schwarz • 2026

Related benchmarks

Task | Dataset | Result | Rank
5.1 Surround Audio Reconstruction | Blender open movie collection sequences 2006-2012 (test) | SI-SDR (Front L/R): 5.72 | 7
Speech Reconstruction | LibriTTS (test) | PESQ: 4.16 | 7
Stereo Music Reconstruction | FMA small | SI-SDR: 9.1 | 7
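
For readers unfamiliar with the SI-SDR metric reported in the benchmark rows, here is a minimal sketch of the commonly used scale-invariant signal-to-distortion ratio for a single channel, conventionally expressed in dB. It is an illustration only: the function name `si_sdr`, the `eps` stabilizer, and the mean removal are assumptions, and this page does not state how the paper aggregates the metric across channels (for example, for the Front L/R pair).

```python
import math


def si_sdr(reference, estimate, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length 1-D sample sequences.

    Hypothetical helper for illustration; not taken from the VCNAC paper.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("reference and estimate must be non-empty and equal in length")

    n = len(reference)
    # Remove the mean of each signal so a DC offset does not affect the score.
    ref_mean = sum(reference) / n
    est_mean = sum(estimate) / n
    ref = [r - ref_mean for r in reference]
    est = [e - est_mean for e in estimate]

    # Project the estimate onto the reference to find the optimal scaling.
    ref_energy = sum(r * r for r in ref)
    scale = sum(e * r for e, r in zip(est, ref)) / (ref_energy + eps)

    # Split the estimate into a scaled-reference part and a residual part.
    target = [scale * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(x * x for x in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


# Toy example: the distortion is orthogonal to the reference and has
# 1/100 of its energy, so the score is approximately 20 dB.
reference = [1.0, -1.0, 1.0, -1.0]
distortion = [0.1, 0.1, -0.1, -0.1]
estimate = [r + d for r, d in zip(reference, distortion)]
print(round(si_sdr(reference, estimate), 2))
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the decoded signal leaves the score unchanged, which is why the metric suits codecs whose output gain may differ from the input.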