
Code Debugging on HumanEval

Top result: MGDebugger, 96.3 accuracy

Oct 2, 2024
Updated 3mo ago

Evaluation Results

Date      Accuracy  (unlabeled)
2024.10   96.3      -
2024.10   95.7      -
2024.10   94.5      -
2024.10   94.5      76.3
2024.10   94.5      77.5
2024.10   93.9      -
2024.10   93.3      -
2024.10   93.3      -
2024.10   92.7      -
2024.10   91.5      64.1
2024.10   90.9      60.5
2024.10   90.2      -
2024.10   89.6      -
2024.10   89.6      -
2024.10   89.6      57.5
2024.10   88.4      52.5
2024.10   87.8      48.7
2024.10   87.8      48.7
2024.10   87.2      44.7
2024.10   86.6      -
2024.10   86.6      45
2024.10   86        39.5
2024.10   86        42.5
2024.10   85.4      38.5
2024.10   84.8      35.9
2024.10   84.1      31.6
2024.10   84.1      33.3
2024.10   84.1      35
2024.10   83.5      32.5
2024.10   83.5      32.5
2024.10   82.9      26.3
2024.10   82.3      23.7
2024.10   82.3      23.7
2024.10   82.3      27.5
2024.10   81.7      21.1
2024.10   80.5      17.9
2024.10   79.9      15.4
2024.10   79.3      -
2024.10   79.3      12.8
2024.10   76.8      -
2024.10   76.2      -
2024.10   75.6      -