Long-context Evaluation on Ruler (Average rank)
[Figure: Average Rank chart. Highlighted point: Ministral-3-8B, Average Rank 2, May 8, 2026.]
Updated 23d ago
Evaluation Results
| Method | Details | Date | Average Rank |
|---|---|---|---|
| Ministral-3-8B | Parameters=8B | 2026.05 | 2 |
| Qwen3-8B | Parameters=8B | 2026.05 | 2.3 |
| Llama-3.1-8B | Parameters=8B | 2026.05 | 2.7 |
| Qwen3-4B | Parameters=4B | 2026.05 | 4.3 |
| gemma-3-12b-it | Parameters=12B, Instru... | 2026.05 | 5 |
| GPT-5 nano | | 2026.05 | 6 |
| Llama-3.2-3B | Parameters=3B | 2026.05 | 6.7 |
| gemma-3-4b-it | Parameters=4B, Instruc... | 2026.05 | 9 |
| gpt-oss-20b | Parameters=20B | 2026.05 | 9.3 |
| Moonlight-16B-A3B | Parameters=16B | 2026.05 | 10.3 |
| EngGPT2-16B-A3B | Parameters=16B | 2026.05 | 11.3 |
| LLaMAntino-3-8B | Parameters=8B | 2026.05 | 11.3 |
| FastwebMIIA-7B | Parameters=7B | 2026.05 | 12 |
| Velvet-14B | Parameters=14B | 2026.05 | 13.7 |
| deepseek-moe-16b | Parameters=16B | 2026.05 | 14.3 |
| Minerva-7B | Parameters=7B | 2026.05 | 15.7 |