Summarization on Custom Hebrew Summarization (test)
Top result: DictaLM-3.0-24B-Thinking, Win Rate (GPT-4o) 56.86 (Feb 2, 2026)
Evaluation Results
| Method | Method details | Date | Win Rate (GPT-4o) |
|---|---|---|---|
| DictaLM-3.0-24B-Thinking | Size Category=Large, R... | 2026.02 | 56.86 |
| gemma3-27B-it | Size Category=Large | 2026.02 | 44.54 |
| gemma-3-12b-it | Size Category=Smaller... | 2026.02 | 39.48 |
| Llama-3.3-70B-Instruct | Size Category=Large | 2026.02 | 37.83 |
| DictaLM-3.0-Nemotron-12B-Instruct | Size Category=Smaller... | 2026.02 | 33.27 |
| aya-expanse-32B | Size Category=Large | 2026.02 | 29.46 |
| Qwen3-14B (think) | Size Category=Smaller... | 2026.02 | 15.83 |
| DictaLM-3.0-1.7B-Thinking | Size Category=Tiny (~1... | 2026.02 | 10.22 |
| DictaLM-3.0-1.7B-Instruct | Size Category=Tiny (~1... | 2026.02 | 9.72 |
| Qwen3-1.7B (think) | Size Category=Tiny (~1... | 2026.02 | 0.4 |
| gemma-3-1b-it | Size Category=Tiny (~1... | 2026.02 | 0.35 |