Output Length Prediction on MBPP (test)
[Chart: MAE over time. Current best: 25.18 MAE (Noise Radius), Apr 9, 2026]
Evaluation Results
| Method | Served Model | Date | MAE |
|---|---|---|---|
| Noise Radius | Qwen-2.5-7B | 2026.04 | 25.18 |
| ProD-D | Llama-3-8B | 2026.04 | 26.61 |
| Noise Radius | Llama-3-8B | 2026.04 | 27.07 |
| ProD-M | Llama-3-8B | 2026.04 | 29.96 |
| ProD-D | Qwen-2.5-7B | 2026.04 | 31.76 |
| ProD-M | Qwen-2.5-7B | 2026.04 | 32.08 |
| TRAIL-last | Llama-3-8B | 2026.04 | 35.74 |
| TRAIL-last | Qwen-2.5-7B | 2026.04 | 36.63 |
| Constant Median | Llama-3-8B | 2026.04 | 37.83 |
| TRAIL-mean | Llama-3-8B | 2026.04 | 37.93 |
| EGTP | Llama-3-8B | 2026.04 | 37.98 |
| S^3 | Llama-3-8B | 2026.04 | 41.28 |
| TRAIL-mean | Qwen-2.5-7B | 2026.04 | 44.7 |
| Constant Median | Qwen-2.5-7B | 2026.04 | 56.41 |
| EGTP | Qwen-2.5-7B | 2026.04 | 57.14 |
| S^3 | Qwen-2.5-7B | 2026.04 | 57.28 |
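
For readers unfamiliar with the metric, the sketch below shows one plausible way an MAE score for output length prediction could be computed, together with a Constant Median style baseline. It assumes MAE is the mean absolute difference between predicted and actual output lengths, and that the baseline predicts the median length of some reference set for every prompt. The benchmark's exact protocol is not given on this page, so the function names and the sample lengths are hypothetical.

```python
from statistics import median


def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over paired length values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def constant_median_predictions(reference_lengths, n):
    """Predict the median of the reference lengths for each of n prompts.

    Assumption: this mirrors what a "Constant Median" baseline does;
    the page does not define the baseline.
    """
    return [median(reference_lengths)] * n


# Hypothetical output lengths in tokens, made up for illustration only.
reference_lengths = [40, 55, 60, 80, 120]
test_lengths = [50, 70, 90]

baseline = constant_median_predictions(reference_lengths, len(test_lengths))
baseline_mae = mean_absolute_error(baseline, test_lengths)
print(f"Constant-median MAE on the toy data: {baseline_mae:.2f}")
```

Lower MAE is better under this reading, which matches the ordering of the results above, where the smallest value is listed first.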