
General Reasoning on StratQA

87.8 Accuracy

Process Supervision

[Chart: accuracy over time. Y-axis ticks: 68.768, 73.709, 78.65, 83.591. X-axis: Dec 17, 2025]
Updated 3d ago

Evaluation Results

Date     Accuracy
2025.12  87.8      -
2025.12  87.2      -
2025.12  87.1      -
2025.12  86.6      -
2025.12  86.5      -
2025.12  86.5      -
2025.12  86.4      -
2025.12  86.4      -
2025.12  86        -
2025.12  85.9      -
2025.12  85.8      -
2025.12  85.6      -
2025.12  85.1      -
2025.12  85.1      -
2025.12  84.8      -
2025.12  84.8      -
2025.12  84.8      -
2025.12  84.8      -
2025.12  84.7      -
2025.12  84.2      -
2025.12  84.2      -
2025.12  84.2      -
2025.12  83.5      -
2025.12  83.5      -
2025.12  83.2      -
2025.12  83.1      -
2025.12  83        -
2025.12  82.9      -
2025.12  82.7      -
2025.12  82.4      -
2025.12  82.3      -
2025.12  81.7      -
2025.12  81.6      -
2025.12  81.5      -
2025.12  81.4      -
2025.12  81        -
2025.12  80.4      -
2025.12  80.3      -58.7
2025.12  80.1      -
2025.12  79.9      -
2025.12  79.7      -
2025.12  79.6      -
2025.12  79.3      -
2025.12  79.2      142.8
2025.12  78.9      -
2025.12  78.9      -
2025.12  78.9      -
2025.12  78.6      156.3
2025.12  78.3      -
2025.12  78.3      -
2025.12  78.2      -
2025.12  78.1      -
2025.12  78        127.6
2025.12  77.9      -
2025.12  77.9      -53.5
2025.12  77.6      -
2025.12  77.5      -
2025.12  77.4      -
2025.12  77.1      345.7
2025.12  77.1      -
2025.12  77.1      -
2025.12  76.8      289.4
2025.12  76.8      -
2025.12  76.7      -
2025.12  76.5      187.2
2025.12  76.5      -
2025.12  76.5      -
2025.12  76.4      -
2025.12  76.1      -
2025.12  75.8      -
2025.12  75.8      -
2025.12  75.2      0
2025.12  75.2      -
2025.12  75.1      -
2025.12  74.8      -
2025.12  74.7      -
2025.12  74.6      -
2025.12  74.2      -
2025.12  74.1      -54.2
2025.12  74        -
2025.12  73.8      -
2025.12  73.5      -50.8
2025.12  73.5      -
2025.12  73.4      -
2025.12  73.1      -
2025.12  72.8      -
2025.12  72.3      -
2025.12  71.8      -
2025.12  71.8      -
2025.12  70.9      -
2025.12  69.5      -