Instruction on IFEval
[Chart: Score by date, May 6, 2026 to May 18, 2026. Top score: 89.8 (Qwen3.5-4B).]
Updated 15d ago
Evaluation Results

Method                  Details                    Date     Score
----------------------  -------------------------  -------  -----
Qwen3.5-4B              Active / Total Paramet...  2026.05  89.8
Gemma-4-E4B-it          Active / Total Paramet...  2026.05  88.5
Qwen3-4B-Thinking-2507  Active / Total Paramet...  2026.05  86.8
ZAYA1-8B                Active / Total Paramet...  2026.05  85.6
Elastic                 Model=LLaDA2.0-mini        2026.05  85.03
Qwen2.5-72B             Parameters=72B             2026.05  84.3
Baseline                Model=LLaDA2.0-mini        2026.05  83.55
Qwen2.5-14B             Parameters=14B             2026.05  81.6
FORGE-7B-MATH           Parameters=7B, Configu...  2026.05  80.5
FORGE-7B-NP             Parameters=7B, Configu...  2026.05  79.6
Qwen2.5-32B             Parameters=32B             2026.05  79.5
FORGE-7B-NP-MATH        Parameters=7B, Configu...  2026.05  79.3
Qwen2.5-7B              Parameters=7B              2026.05  73.5
InternLM3-8B            Parameters=8B              2026.05  72.5
LLama3.1-8b             Parameters=8B              2026.05  69.3
Qwen2.5-3B              Parameters=3B              2026.05  59.1
Qwen2.5-7B-SFT          Parameters=7B, Configu...  2026.05  55.8