Text-To-Speech on Downstream Audio Generation TTS
[Chart: best WER 3.03 (LoSATok), May 27, 2026. Metrics shown: WER, SIM Score, UTMOS.]
Updated 6d ago
Evaluation Results
| Method | Configuration | Date | WER | SIM Score | UTMOS |
|---|---|---|---|---|---|
| LoSATok | Latent Dim.=128, DiT D... | 2026.05 | 3.03 | 54.8 | 3.367 |
| UniFlow-Audio | Latent Dim.=128, DiT D... | 2026.05 | 3.589 | 40.8 | 2.768 |
| DashengTokenizer | Latent Dim.=1280, DiT... | 2026.05 | 3.652 | 28.7 | 3.144 |
| LoSATok | Latent Dim.=128, DiT D... | 2026.05 | 3.667 | 50.7 | 3.31 |
| UniFlow-Audio | Latent Dim.=128, DiT D... | 2026.05 | 4.529 | 34.4 | 2.259 |
| DashengTokenizer | Latent Dim.=1280, DiT... | 2026.05 | 75.469 | 10.3 | 1.322 |
| DashengTokenizer | Latent Dim.=1280, DiT... | 2026.05 | 84.761 | 7.4 | 1.296 |
| DashengTokenizer | Latent Dim.=1280, DiT... | 2026.05 | 100 | 1.5 | 1.251 |