Code Generation on MBPP-ET (Acc, RSR)
[Chart: Acc and RSR; current best Acc 73 by EG-CFG, Jun 12, 2025]
Updated 4d ago
Evaluation Results
Method              Model                Date     Acc   RSR
EG-CFG              DeepSeek-V3-0324     2025.06  73    23.29
MapCoder            DeepSeek-V3-0324     2025.06  69.6  13.63
LPW                 GPT-4o               2025.06  65.8  -
LPW                 DeepSeek-V3-0324     2025.06  65.2  1.13
Baseline LLM        DeepSeek-V3-0324     2025.06  64.8  0
MGDebugger          DeepSeek-V3-0324     2025.06  64.8  0
Self-Collaboration  GPT-4                2025.06  62.1  -
EG-CFG              DeepSeek-Coder 1.3B  2025.06  59.8  29.96
MapCoder            GPT-4                2025.06  57.5  -
Baseline LLM        GPT-4                2025.06  49.2  -
MapCoder            DeepSeek-Coder 1.3B  2025.06  46.2  6.27
MGDebugger          DeepSeek-Coder 1.3B  2025.06  44.6  3.48
Baseline LLM        DeepSeek-Coder 1.3B  2025.06  42.6  0
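
This page does not define RSR. One reading that appears consistent with the listed values is a relative success rate: the accuracy gain over the Baseline LLM entry for the same model, divided by the headroom that baseline leaves below 100%. The sketch below assumes that definition; the function name `rsr` and the choice of rows are illustrative and do not come from the page.

```python
def rsr(acc: float, baseline_acc: float) -> float:
    """Relative success rate in percent (ASSUMED definition, not stated on the page).

    acc          -- accuracy of the method, in percent
    baseline_acc -- accuracy of the Baseline LLM with the same model, in percent
    """
    headroom = 100.0 - baseline_acc  # accuracy points the baseline leaves unsolved
    return 100.0 * (acc - baseline_acc) / headroom


# (method, model, Acc, Baseline LLM Acc for the same model), taken from the results above
rows = [
    ("EG-CFG", "DeepSeek-V3-0324", 73.0, 64.8),
    ("MapCoder", "DeepSeek-V3-0324", 69.6, 64.8),
    ("EG-CFG", "DeepSeek-Coder 1.3B", 59.8, 42.6),
    ("MapCoder", "DeepSeek-Coder 1.3B", 46.2, 42.6),
]

for method, model, acc, base in rows:
    print(f"{method} ({model}): RSR = {rsr(acc, base):.2f}")
```

Under this assumption the computed values agree with the RSR entries above to within 0.01, and a method that only matches its baseline gets 0. Rows listed with "-" have no RSR entry on the page, so the sketch does not guess one for them.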