| Dataset Name | SOTA Method | Metric | Score | Count | Updated |
|---|---|---|---:|---:|---|
| LiveCodeBench | | Accuracy | 87.4 | 90 | 15d ago |
| HumanEval | DeepSeek-R1-Distill-Qwen-14B (Reasoning) | HumanEval Score | 95.73 | 62 | 1d ago |
| CRUXEval-O | Kimi-K2 Base | Accuracy | 83.5 | 61 | 1d ago |
| CRUXEval | | Input-CoT Accuracy | 98.8 | 56 | 2mo ago |
| CRUXEval | PPoT + Qwen2.5 Coder | Accuracy | 76.87 | 36 | 6d ago |
| HumanE | Denser | Accuracy | 84.9 | 35 | 3mo ago |
| MBPP | | MBPP Execution Accuracy | 84.7 | 33 | 7d ago |
| LCB v6 | CE-GPPO | Accuracy | 53.6 | 26 | 7d ago |
| MBPP | COPT | Accuracy | 94.55 | 26 | 13d ago |
| LCB | SCF-RKL | pass@1 | 62.46 | 26 | 11d ago |
| CRUX | RMoA | Accuracy | 87.37 | 26 | 21d ago |
| LeetCodeDataset | Hybrid-LoRA | Pass@4 | 74.5 | 25 | 14d ago |
| SuperGPQA Code SGPQA-1k | DFT | Accuracy | 47.4 | 24 | 1mo ago |
| R-Bench-T Code | DFT | Accuracy | 49.91 | 24 | 1mo ago |
| OJBench | DFT | Accuracy | 10.34 | 24 | 1mo ago |
| LiveCodeBench (LCB) | DFT | Accuracy (%) | 37.82 | 24 | 1mo ago |
| CRUX official (test) | | Pass@1 Accuracy | 51.1 | 20 | 1mo ago |
| Code HumanEval+ LiveCodeBench v5 | Qwen3-4B-Base (HEAL) | HEval+ (Pass@1) | 79.88 | 18 | 1mo ago |
| LiveCodeBench 1.0 (test) | A3PO | Accuracy | 47.2 | 18 | 3mo ago |
| LCB v5 | MCPO-DAPO | Accuracy | 33.72 | 16 | 7d ago |
| KodCode (test) | MoLEM | Accuracy (KodCode Test) | 74.2 | 13 | 11d ago |
| CodeForces | LAD | Rating | 1,533.64 | 12 | 29d ago |
| HumanEval+ | ResRL | Pass@16 | 97 | 12 | 29d ago |
| LiveCodeBench | ResRL | Avg@16 | 43.2 | 12 | 29d ago |
| CruxEval Output | DataFlow-Code-10K | Score | 51 | 12 | 3mo ago |
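
Several rows above report pass@k-style metrics (pass@1, Pass@4, Pass@16). The table does not say how each entry computed its score, so the following is only a minimal sketch of the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), introduced alongside HumanEval. The function names `pass_at_k` and `mean_pass_at_k` and the sample counts are hypothetical, chosen for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem.

    n: samples generated, c: samples that passed all tests, k: sample budget.
    Assumption: this is the common estimator, not necessarily what any
    leaderboard entry above used.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed, giving 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, each given as (n, c)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical per-problem counts: (samples drawn, samples correct).
example = [(10, 3), (10, 0), (10, 10)]
score = mean_pass_at_k(example, k=1)
```

With k = 1 the estimator reduces to the fraction of correct samples per problem, c / n, averaged over problems. Larger k rewards a model that solves a problem at least once within its sample budget.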