
General Performance on Performance Bench Reasoning & Knowledge (Average)

78.37 Average Score

DeepSeek-R1-Distill-Qwen-14B (Reasoning)

[Chart: average score over time; y-axis ticks 50.6956, 57.8803, 65.065, 72.2497; x-axis label Jan 9, 2026]
Updated 2d ago

Evaluation Results

Method  Links
2026.01  78.37
2026.01  75.4
2026.01  68.11
2026.01  65.11
2026.01  63.88
2026.01  62.6
2026.01  61.07
2026.01  59.29
2026.01  51.76