Scene-aware visually driven speech synthesis on Vivid-210K (test)

Top result: VividVoice, WER 7.15

Updated 3mo ago

Evaluation Results

Method	Links	WER
VividVoice 2026.02		7.15	3.98	1.53	0.25	3.95	3.08	4.3	3.88
VoiceLDM 2026.02		9.23	4.74	1.79	0.27	3.23	1.75	2.56	3.41
GT (ground truth) 2026.02		10.62	-	-	0.39	4.36	4.03	4.11	4.25