CodeContests

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
| --- | --- | --- | --- | --- |
| Code Generation | CodeContests | Pass@1 | 89.09 | 68 |
| Code Generation | CodeContests (test) | Pass@1 | 1,200 | 68 |
| Code Generation | CodeContests | Accuracy | 21.2 | 30 |
| Code Generation | CodeContests | Avg@8 | 39.33 | 26 |
| Code Generation | CodeContests official (val) | Pass@4 | 43.6 | 24 |
| Code Generation | CodeContests | Signal | 38 | 21 |
| Code Generation | CodeContests | Pass@1 | 55.2 | 21 |
| Code Generation | CodeContests+ | LCBv6 Score | 39.1 | 15 |
| Code Generation | CodeContests | Accuracy (CC) | 26.7 | 15 |
| Efficiency Test Generation | CodeContests C++ | ASR (Fast) | 62.22 | 8 |
| Efficiency Test Generation | CodeContests Java | Acceptance Success Rate (Fast) | 63.08 | 8 |
| Efficiency Test Generation | CodeContests Python | ASR (Fast) | 60 | 8 |
| Efficiency-oriented test case generation | CodeContests | ASR (Mean) | 75.82 | 8 |
| Code Generation | CodeContests (evaluation set) | Pass@1 | 19.7 | 8 |
| Competitive Programming | CodeContests (val) | Pass@1 | 68.86 | 6 |
| Program Synthesis | CodeContests (test) | Pass@1 | 0.2045 | 6 |
| Coding Reasoning | Codecontests | Pass Rate | 65.8 | 5 |
| Multi-agent Selection (Pairwise Resolution) | CodeContests (test) | Pairwise Resolution | 89.4 | 3 |
| Code Generation | CodeContests transfer | Mean F1 Score | 38.24 | 3 |
| Competition-Level Code Generation | CodeContests (val) | 10@1k | 21 | 3 |