Code Reasoning on HumanE

84.9 Accuracy

[Chart: Accuracy, Dec 17, 2025]
Updated 4d ago

Evaluation Results

Date      Accuracy   (unlabeled)
2025.12   84.9       -
2025.12   84.6       -
2025.12   84         -
2025.12   83.7       -
2025.12   83.2       -
2025.12   83.1       -
2025.12   82.9       -
2025.12   82.5       -
2025.12   82.3       -
2025.12   81.9       -
2025.12   81.8       -
2025.12   81.6       -
2025.12   80.9       -
2025.12   80.8       -
2025.12   80.7       -
2025.12   80.2       -
2025.12   80.1       -
2025.12   79.8       -
2025.12   79.5       -
2025.12   78.9       -
2025.12   78.2       -
2025.12   78.2       -
2025.12   78.2       -58.7
2025.12   77.5       -
2025.12   76.9       142.8
2025.12   76.3       156.3
2025.12   75.8       -
2025.12   75.8       -53.5
2025.12   75.4       127.6
2025.12   75         345.7
2025.12   74.1       289.4
2025.12   73.8       187.2
2025.12   72.9       0
2025.12   71.5       -54.2
2025.12   70.2       -50.8