Speech Reconstruction

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Count	Last Updated
LibriTTS clean (test)		PESQ	4.644	67	18d ago
LibriSpeech (test-clean)	StableCodec	UTMOS	4.23	64	1mo ago
LibriSpeech clean (test)	StableCodec	UTMOS	4.31	60	25d ago
LibriTTS (test-other)	CleanCodec@12.5	UTMOS	4.17	57	1mo ago
AISHELL-2 Chinese	MOSS-Audio-Tokenizer	SIM	0.93	54	4mo ago
LibriSpeech English (test-clean)	MOSS-Audio-Tokenizer	SIM	0.97	54	4mo ago
LibriTTS + ESC-50 clean-100 (test)	FocalSE	PESQ	2.97	48	18d ago
SEED-ZH	MingTok-Audio	PESQ	4.21	29	1mo ago
SeedTTS en (test)		WER	0.0214	21	1mo ago
Chinese speech	SAC	UTMOS	2.99	19	4mo ago
English speech	WavTokenizer	UTMOS	3.92	19	4mo ago
Salmon Sentiment Consistency emotional 2025b (OOD)		WER	2.9	18	4mo ago
Seed-TTS English	DashengTokenizer	PESQ	4.125	17	1mo ago
SEED-EN	H-Codec-2.0 (Large)	PESQ	2.77	12	4mo ago
Open Track 2 (test)	Baseline	ScoreQ-ref	1.15	12	4mo ago
Open Track 1 (test)	Baseline	ScoreQ-ref	1.36	12	4mo ago
MLS non-English OOD (700 utterances (7 languages))	DTM-Codec	UTMOS	3.16	10	25d ago
ReSSInt laryngeal subjects (test)	Masked Multimodal Speech Synthesis Framework	WER	40.5	9	1mo ago
LibriSpeech other (test)	UniAudio-Token	WER	6.79	9	1mo ago
MLS (Multilingual LibriSpeech) Non-English (test)	Mimi-32	WER	7.3	9	4mo ago
English Read by Japanese accented speech 2007 (OOD)		WER	14.9	9	4mo ago
Japanese Versatile Speech unseen language speech 2019 (OOD)		WER	4.6	9	4mo ago
Gigaspeech noisy speech 2021 (OOD)		WER	9.7	9	4mo ago
LibriSpeech short splits (4s-10s) (test-clean)	AugCodec-3	WER	5.12	8	1mo ago
Librispeech (test)	MSR-Codec-612	STOI	0.9	8	4mo ago

Showing 25 of 61 rows