| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Code Generation | DS-1000 | Pass@1 | 58.65 | 28 |
| Code Generation | DS-1000 1.0 (test) | Matplotlib | 67.2 | 19 |
| Code Hallucination Detection | DS-1000 | OP | 0.95 | 16 |
| Code Generation | DS-1000 | Matplotlib Score | 68.6 | 15 |
| Data Science Code Generation | DS-1000 | Matplotlib Score | 60.3 | 13 |
| Code Generation | DS-1000 | Accuracy | 69.9 | 11 |
| Data Science Code Completion | DS-1000 | Pandas (Pass@1) | 32 | 9 |
| Data Science | DS-1000 | Performance Score | 56.5 | 8 |
| Code Completion | DS-1000 (test) | Matplotlib Success Rate | 56.1 | 8 |
| Code Generation | DS-1000 Python | Pass@1 | 39.7 | 7 |
| Code Completion | DS-1000 1.0 (test) | Success Rate (Matplotlib) | 55.2 | 5 |
| Code Suggestion | DS-1000 | Binary Reward | 51 | 4 |
| Code Test Generation | DS-1000 | Source Pass Rate | 68.12 | 4 |
| Code Insertion | DS-1000 1.0 (test) | Success Rate (Matplotlib) | 55.2 | 3 |