| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Triton Kernel Generation | Synthetic Benchmark Overall All Levels | Average Speedup 1.57 | 7 |
| Triton Kernel Generation | Synthetic Benchmark Level 20 | Accuracy 99 | 7 |
| Triton Kernel Generation | Synthetic Benchmark (Level 5) | Acc 99 | 7 |
| Triton Kernel Generation | Synthetic Benchmark Level 2 | Accuracy 96 | 7 |
| Triton Kernel Generation | Synthetic Benchmark Level 1 | Accuracy 86.8 | 7 |
| Shortest Path | synthetic benchmark | Accuracy 95 | 7 |
| Edge Existence | Synthetic Benchmark | Accuracy 99.7 | 7 |
| Node Degree | Synthetic Benchmark | Accuracy 99.75 | 7 |
| Triangle Count | synthetic benchmark | Accuracy 74.35 | 7 |
| Cycle Check | synthetic benchmark | Accuracy 99.9 | 7 |
| Edge Count | Synthetic Benchmark | Accuracy 94.95 | 7 |
| Node Count | synthetic benchmark 1.0 (test) | Accuracy 100 | 7 |
| Learning to Defer | Synthetic benchmark (test) | Test True Risk 28.1 | 6 |
| Bokeh Rendering | Synthetic Benchmark | RMSE 0.0133 | 5 |
| Domain Adaptation | Synthetic Benchmark | Geometry Score 58 | 4 |
| Generative model evaluation metric validation | Synthetic benchmark 2025 (test) | Metric - | 0 |