Script Confusion Mitigation on FLEURS sr-latn (test)
[Chart: Accuracy (Normalized Edit Similarity); highlighted result: steer, 96, Jan 6, 2026]
Evaluation Results
Method      Model Size   Date      Accuracy (Normalized Edit Similarity)
steer       large        2026.01   96
steer       v2           2026.01   96
steer       v3           2026.01   96
prompt      large        2026.01   95
prompt      v2           2026.01   95
prompt      v3           2026.01   95
no-prompt   large        2026.01   93
no-prompt   v3           2026.01   93
prompt      medium       2026.01   93
steer       medium       2026.01   93
no-prompt   small        2026.01   90
prompt      small        2026.01   90
steer       small        2026.01   89
no-prompt   base         2026.01   81
no-prompt   v2           2026.01   81
prompt      base         2026.01   81
steer       base         2026.01   80
no-prompt   tiny         2026.01   66
prompt      tiny         2026.01   64
steer       tiny         2026.01   64
no-prompt   medium       2026.01   63