Neural Vocoding

Benchmarks

Dataset Name	SOTA Method	Metric	Value		Updated
LibriTTS (test)	BigVGAN	PESQ	4.269	18	4mo ago
LibriTTS-R clean-100 (test)		NISQA	4.08	15	2mo ago
LJSpeech 1.1 (test)	BigVGAN	M-STFT	0.9	12	4mo ago
LJSpeech 88 (test)	BigVGAN	M-STFT	0.9	12	4mo ago
LibriTTS		UTMOS	4.058	12	4mo ago
VCTK 100 audio clips (unseen)	BigVGAN	MAE	0.0925	10	5mo ago
LibriTTS clean (dev)	BigVGAN	MAE	0.0931	10	5mo ago
VCTK English Corpus with Unseen Speakers (out-of-domain)		UTMOS	4.117	9	4mo ago
EARS (out-of-domain)		UTMOS	3.3	9	4mo ago
LJSpeech	DiffWave	MOS	4.49	9	5mo ago
LJSpeech (Long Audio)		MOS	4.73	8	4mo ago
LJSpeech Short Audio		MOS	3.67	8	4mo ago
VCTK (unseen speakers)		MOS	4.37	8	5mo ago
LJSpeech and VCTK		MOS	4.6	6	5mo ago
Inference Speed Benchmark batch size 16, 1s samples	BigVGAN	xRT (GPU)	98.61	5	5mo ago
MUSDB18 (out-of-distribution)	Vocos	Mixture Score	4.61	4	5mo ago
Deeply Korean		SMOS	4.847	3	4mo ago
LJSpeech (test)	RNDVoC-Lite	PESQ	3.769	3	4mo ago
