
Zero-shot Evaluation on Non-reasoning tasks

70.8 Accuracy (Zero-shot Non-reasoning)

FP16

[Chart: accuracy; y-axis ticks 66.432, 67.566, 68.7, 69.834; date Nov 13, 2025]
Updated 4d ago

Evaluation Results

Method    Links
2025.11    70.8    -
2025.11    69.4    -
2025.11    67.7    -
2025.11    66.6    -
2025.11    -    70.8
2025.11    -    68.4
2025.11    -    69.6