
Scene-aware visually driven speech synthesis on Vivid-210K (test)

7.15 WER

VividVoice

7.01127.94818.8859.8219 Feb 1, 2026
Updated 1mo ago

Evaluation Results

Method    Links
2026.02    7.15    3.98    1.53    0.25    3.95    3.08    4.3     3.88
2026.02    9.23    4.74    1.79    0.27    3.23    1.75    2.56    3.41
2026.02    10.62   -       -       0.39    4.36    4.03    4.11    4.25