Translation on Custom English-to-Hebrew
[Chart: Win Rate (GPT-4o Judge). Best result: 30.09, DictaLM-3.0-24B-Thinking, Feb 2, 2026]
Evaluation Results
Method                             Details                    Date     Win Rate (GPT-4o Judge)
DictaLM-3.0-24B-Thinking           Size Category=Large, R...  2026.02  30.09
gemma3-27B-it                      Size Category=Large        2026.02  26.73
Llama-3.3-70B-Instruct             Size Category=Large        2026.02  19.31
aya-expanse-32B                    Size Category=Large        2026.02  17.1
gemma-3-12b-it                     Size Category=Smaller...   2026.02  16.5
DictaLM-3.0-Nemotron-12B-Instruct  Size Category=Smaller...   2026.02  13.5
DictaLM-3.0-1.7B-Thinking          Size Category=Tiny (~1...  2026.02  2.51
DictaLM-3.0-1.7B-Instruct          Size Category=Tiny (~1...  2026.02  2.16
Qwen3-14B (think)                  Size Category=Smaller...   2026.02  0.9
gemma-3-1b-it                      Size Category=Tiny (~1...  2026.02  0.15
Qwen3-1.7B (think)                 Size Category=Tiny (~1...  2026.02  0