
Scientific Reasoning on ARC Challenge

92.5 Accuracy

Qwen-3-32B

[Chart: accuracy over time, Dec 1, 2025 to Feb 12, 2026]
Updated 3d ago

Evaluation Results

Method   Links

2026.02  92.5   -
2026.02  92.3   -
2026.02  91.5   -
2026.02  91.4   -
2026.02  90.3   -
2026.02  90.1   -
2026.02  90.1   -
2026.02  90     -
2026.02  89.6   -
2026.02  89.2   -
2026.02  88.4   -
2026.02  87.2   -
2026.02  84.6   -
2026.02  82.7   -
2026.02  79.6   -
2026.02  79     -
2026.02  78.7   -
2026.02  78.6   -
2026.02  72.2   -
2026.02  69.9   -
2026.02  68.1   -
         66.9   -
2026.02  54.6   -
2026.02  44.8   -
2026.02  44.4   -
2026.02  44.36  -
2026.02  44.28  -
2026.02  43.09  -
2026.02  41.98  -
2026.02  41.97  -
2026.02  40.35  -
2026.02  40.2   -
2026.02  38.99  -
2026.02  38.82  -
2026.02  38.73  -
2026.02  35.24  -
2026.02  34.89  -
2026.02  34.47  -
2026.02  34.13  -
2026.02  33.78  -
2026.02  32.85  -

2025.12  0.826  0.705
2025.12  0.822  0.652
2025.12  0.82   0.687
2025.12  0.817  0.578
2025.12  0.814  0.691
2025.12  0.811  0.682
2025.12  0.776  0.603
2025.12  0.742  0.581
2025.12  0.737  0.579
2025.12  0.715  0.572
2025.12  0.682  0.556
2025.12  0.678  0.519
2025.12  0.637  0.635
2025.12  0.602  0.433
2025.12  0.474  0.461