
Reasoning on Reasoning Tasks Average

68.6 Average Score

Llama-3-8B

[Chart: Average Score over time, Mar 17, 2025 to Nov 13, 2025]
Updated 4d ago

Evaluation Results

Method    Links
2025.03    68.6
2025.03    68.4
2025.03    68.2
2025.03    67.9
2025.03    67.3
2025.03    64.9
2025.03    64.5
2025.03    64.4
2025.03    63.9
2025.03    63.7
2025.11    62.8
2025.11    61.9
2025.03    61.7
2025.11    61
2025.11    59.5
2025.03    57.8
2025.03    56.4
2025.11    55.6
2025.03    53.6
2025.03    51.7
2025.03    44.9
2025.03    40.8
2025.03    40.2
2025.03    38.8
2025.03    36.8
2025.03    36.2
2025.03    36
2025.03    35.6
2025.03    35.5
2025.03    35.3
2025.03    34.9
2025.03    34.9