General Performance on Performance Bench Reasoning & Knowledge (Average)
Top result: DeepSeek-R1-Distill-Qwen-14B (Reasoning), Average Score 78.37
[Chart: Average Score over time; data point at Jan 9, 2026]
Evaluation Results
Method                                     Model Family   Date      Average Score
DeepSeek-R1-Distill-Qwen-14B (Reasoning)   Qwen2.5-14B    2026.01   78.37
ReasonAny                                  Qwen2.5-14B    2026.01   75.4
Task Arithmetic                            Qwen2.5-14B    2026.01   68.11
Qwen2.5-14B-Instruct (Safety)              Qwen2.5-14B    2026.01   65.11
LED                                        Qwen2.5-14B    2026.01   63.88
TIES                                       Qwen2.5-14B    2026.01   62.6
Linear                                     Qwen2.5-14B    2026.01   61.07
DARE                                       Qwen2.5-14B    2026.01   59.29
FuseLLM                                    Qwen2.5-14B    2026.01   51.76