Zero-shot Text-to-Speech on LibriSpeech SNR = ∞ (test-clean)
Best UTMOS: 3.76 (F5-TTS), May 19, 2025
Evaluation Results
| Method | Training Dataset | Date | UTMOS | WER | Similarity (Objective) | Similarity (Reference) | F0 Accuracy | F0 RMSE | Energy Accuracy | Energy RMSE |
|---|---|---|---|---|---|---|---|---|---|---|
| F5-TTS | LibriTTS | 2025.05 | 3.76 | 24 | 53 | - | 80 | 13.78 | 67 | 0.01 |
| VALL-E | LibriTTS | 2025.05 | 3.68 | 19 | 40 | 48 | 75 | 21.66 | 36 | 0.02 |
| VoiceCraft | GigaS... | 2025.05 | 3.55 | 18 | 51 | 45 | 78 | 17.22 | 44 | 0.01 |
| OZSpeech | Libri... | 2025.05 | 3.19 | 6 | 39 | 46 | 78 | 13.67 | 65 | 0.01 |
| OZSpeech | LibriTTS | 2025.05 | 3.15 | 5 | 39 | 47 | 81 | 11.96 | 67 | 0.01 |
| NaturalSpeech 2 | LibriTTS | 2025.05 | 2.38 | 9 | 31 | 38 | 80 | 15.62 | 25 | 0.02 |