Zero-shot Evaluation on Reasoning tasks

70.7 Reasoning Accuracy

FP16

Nov 13, 2025
Updated 3d ago

Evaluation Results

Method Links
2025.11    70.7
2025.11    63.9
2025.11    63.6
2025.11    61.7
2025.11    0.707
2025.11    0.625
2025.11    0.614