Code Synthesis on MBPP Sanitized (Acc, Steps)
Chart (Accuracy, Steps): top result 64.59 Accuracy, KLASS, Nov 7, 2025. Updated 1mo ago.
Evaluation Results
| Method | Configuration | Date | Accuracy | Steps |
|---|---|---|---|---|
| KLASS | Parallel=Yes, Backbone... | 2025.11 | 64.59 | 111.24 |
| Top-1 | Parallel=No, Backbone=... | 2025.11 | 63.81 | 256 |
| KL divergence | Parallel=Yes, Backbone... | 2025.11 | 62.65 | 108.15 |
| Confidence | Parallel=Yes, Backbone... | 2025.11 | 57.59 | 72.49 |
| KLASS | Parallel=Yes, Backbone... | 2025.11 | 47.86 | 119.59 |
| Confidence | Parallel=Yes, Backbone... | 2025.11 | 47.08 | 85.2 |
| Top-2 | Parallel=Yes, Backbone... | 2025.11 | 47.08 | 128 |
| Top-1 | Parallel=No, Backbone=... | 2025.11 | 46.69 | 256 |
| KL divergence | Parallel=Yes, Backbone... | 2025.11 | 45.53 | 150.47 |
| Top-2 | Parallel=Yes, Backbone... | 2025.11 | 37.74 | 128 |
| Random | Parallel=No, Backbone=... | 2025.11 | 29.18 | 256 |
| Random | Parallel=No, Backbone=... | 2025.11 | 28.14 | 256 |