Commonsense Reasoning on Winogrande (Score)
Best score: 72.77 (Gemma3-4B)
[Chart: Score over time, Jan 27, 2026 to Apr 24, 2026]
Updated 1mo ago
Evaluation Results
Method | Configuration | Date | Score
Gemma3-4B | Model Scale=4B | 2026.04 | 72.77
Llama3.2-3B | Model Scale=3B | 2026.04 | 72.38
Qwen3-4B | Model Scale=4B | 2026.04 | 70.48
Qwen2.5-3B | Model Scale=3B | 2026.04 | 69.46
SpB2.0-5B | Model Scale=5B, Additi... | 2026.04 | 68.35
Pre-LN | Architecture=512 Layer... | 2026.01 | 66.7
KEEL | Architecture=512 Layer... | 2026.01 | 66.7
DP | Model Scale=Gemma 9B,... | 2026.04 | 64.2
Decoupled DiLoCo | Model Scale=Gemma 9B,... | 2026.04 | 63.8
Decoupled DiLoCo | Model Scale=Gemma 5B,... | 2026.04 | 61.3
DP | Model Scale=Gemma 5B,... | 2026.04 | 59.6
DP | Model Scale=Gemma 2B,... | 2026.04 | 55.6
Decoupled DiLoCo | Model Scale=Gemma 2B,... | 2026.04 | 54.1