| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| MBPP+ | PrivCode | Pass@1 | 65.6 | 33 | 4d ago |
| HumanEval+ | NonDPFT | Pass@1 | 56.7 | 33 | 4d ago |
| Python Code Completion downstream | | Exact Similarity (ES) | 57.7 | 21 | 4d ago |
| BigCodeBench Hard | NonDPFT | Pass@1 | 16.2 | 20 | 4d ago |
| BigCodeBench Full | NonDPFT | Pass@1 | 46.1 | 20 | 4d ago |
| HumanEval | NonDPFT | Pass@1 | 0.652 | 20 | 4d ago |
| BigCodeBench | Qwen2.5-Coder-7B | Full Score | 45.8 | 17 | 4d ago |
| MBPP Sanitized | OpenCoder-8B-Base | MBPP Pass Rate | 79.9 | 17 | 4d ago |
| HumanEval | OpenCoder-8B-Base | HE Score | 66.5 | 17 | 4d ago |
| MBPP (test) | | 3-Shot Pass Rate | 69.4 | 16 | 4d ago |
| VerilogEval CC v2 (test) | DeepSeek-R1-671B | Pass@1 | 79.1 | 11 | 4d ago |
| RepoBench-P | CoA | Similarity | 0.7305 | 10 | 4d ago |
| DS-1000 (test) | DeepSeek-Coder-Base | Matplotlib Success Rate | 56.1 | 8 | 4d ago |
| py150 OOD WILDS (test) | CORAL + SDG | MethodClass Accuracy | 68.3 | 7 | 4d ago |
| DS-1000 1.0 (test) | WizardCoder | Success Rate (Matplotlib) | 55.2 | 5 | 4d ago |
| CrossCodeEval Project-level (test) | CoCoGen | C-EM | 9.08 | 4 | 4d ago |