| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Bank Marketing | MostlyAI | F1 Score | 88.5 | 15 | 2mo ago |
| Boston Housing | HELIX | RMSE | 1.747 | 7 | 2mo ago |
| Adult Income | HELIX | F1 Score | 82.07 | 7 | 2mo ago |
| Timely-Eval | TimelyLM-8B | Leaf Classification Accuracy | 0.939 | 7 | 3mo ago |
| FLAME 1.0 (test) | | Clustering Score | 100 | 6 | 14d ago |
| Machine Learning benchmark Competency-level accuracy | deepseek-r1-distill-qwen-32b | Clustering Accuracy | 88 | 6 | 14d ago |
| Transparent Conductors | HELIX | RMSE | 0.049 | 6 | 2mo ago |
| MMLU Machine Learning 1.0 (test) | TextGrad | Accuracy | 88.4 | 4 | 3mo ago |
| NanoGPT Speedrun Competition | | NanoGPT Score | 96.8 | 2 | 3mo ago |