Code Reasoning on MBPP (Accuracy, Token Cost)

Accuracy: 94.55

[Chart: Accuracy by date, Sep 30, 2025 to May 19, 2026; metrics: Accuracy, COPT]
Updated 13d ago

Evaluation Results

Date      Accuracy   COPT
2026.05   94.55      1,997
2026.05   94.16      2,033
2026.05   91.44      2,724
2025.12   67.3       -58.7
2025.12   66.1       142.8
2025.12   65.8       156.3
2025.12   64.9       127.6
2025.12   64.5       345.7
2025.12   64.2       -53.5
2025.12   63.4       289.4
2025.12   62.7       187.2
2025.12   61.8       0
2025.12   60.1       -54.2
2025.12   58.5       -50.8
2025.09   38.8       -
2025.09   33.4       -
2025.09   32.2       -
2025.09   31.5       -
2025.09   30.8       -
2025.09   26.3       -
2025.09   24.7       -
2025.09   24.4       -
2025.09   24         -
2025.09   23.7       -
2025.09   19.3       -
2025.09   18.9       -