Code Generation on BigCodeBench (avg@32)
Best result: Code-A1, 52.46 avg@32 (Mar 16, 2026)
Updated 1mo ago
Evaluation Results
Method        Method details              Date      avg@32
Code-A1       Code LLM=Qwen2.5-Coder...   2026.03   52.46
Golden Tests  Code LLM=Qwen2.5-Coder...   2026.03   52.28
Self-Play     Code LLM=Qwen2.5-Coder...   2026.03   52.25
/             Code LLM=Qwen2.5-Coder...   2026.03   49.41
Code-A1       Code LLM=Qwen2.5-Coder...   2026.03   45.85
Golden Tests  Code LLM=Qwen2.5-Coder...   2026.03   45.41
Self-Play     Code LLM=Qwen2.5-Coder...   2026.03   45.09
/             Code LLM=Qwen2.5-Coder...   2026.03   41.78
Code-A1       Code LLM=Qwen2.5-Coder...   2026.03   34.82
Golden Tests  Code LLM=Qwen2.5-Coder...   2026.03   34.23
Self-Play     Code LLM=Qwen2.5-Coder...   2026.03   33.47
/             Code LLM=Qwen2.5-Coder...   2026.03   29.34