Code Generation on LiveCodeBench v1-5
[Chart: pass@1. Highlighted point: SAGE, 30.6 pass@1, Mar 16, 2026. Updated 1mo ago.]
Evaluation Results
Method      Backbone          Date     pass@1
SAGE        Qwen-3-4B-Base    2026.03  30.6
SAGE        Qwen-2.5-7B-I...  2026.03  26.4
AZR         Qwen-3-4B-Base    2026.03  26.1
AZR         Qwen-2.5-7B-I...  2026.03  25.3
MAE         Qwen-3-4B-Base    2026.03  24.2
MAE         Qwen-2.5-7B-I...  2026.03  23.3
Base Model  Qwen-3-4B-Base    2026.03  21.5
Base Model  Qwen-2.5-7B-I...  2026.03  17.5
SAGE        Qwen-2.5-3B-I...  2026.03  16.9
MAE         Qwen-2.5-3B-I...  2026.03  15.9
AZR         Qwen-2.5-3B-I...  2026.03  15
Base Model  Qwen-2.5-3B-I...  2026.03  12