Coding on MBPP (Solve Rate)
[Chart: Solve Rate over time, Feb 11, 2026 to Mar 16, 2026. Metrics: Solve Rate, Executability. Top result: CapFlow, Solve Rate 83.28.]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Solve Rate | Executability |
|---|---|---|---|---|
| CapFlow | Type=Learning, Setting... | 2026.02 | 83.28 | - |
| AFlow | Type=Refinement, Setti... | 2026.02 | 82.99 | - |
| ScoreFlow | Type=Learning, Setting... | 2026.02 | 82.69 | - |
| CapFlow | Type=Learning, Setting... | 2026.02 | 82.11 | - |
| ScoreFlow | Type=Learning, Setting... | 2026.02 | 81.23 | - |
| CoT-SC | Type=Manual, Setting=M... | 2026.02 | 72.72 | - |
| SPP | Type=Manual, Setting=M... | 2026.02 | 72.72 | - |
| CoT | Type=Manual, Setting=M... | 2026.02 | 70.96 | - |
| GPT-4o-mini | Type=Manual, Setting=M... | 2026.02 | 70.67 | - |
| ADAS | Type=Refinement, Setti... | 2026.02 | 70.08 | - |
| Self-Refine | Type=Manual, Setting=M... | 2026.02 | 69.5 | - |
| Llama3.2 | Number of Parameters=1... | 2026.03 | 39.6 | - |
| MobileLLM-Flash | Number of Parameters=1... | 2026.03 | 35.6 | - |
| Gemma3 | Number of Parameters=1... | 2026.03 | 35.2 | - |
| MobileLLM-Flash | Number of Parameters=6... | 2026.03 | 33 | - |