| Dataset Name | SOTA Method | Metric | Value | Count | Last Updated |
|---|---|---|---|---|---|
| DSPredict Easy (test) | | Validation Rate | 100 | 12 | 3mo ago |
| DSPredict Hard (Private test) | | Valid Score | 85.7 | 12 | 3mo ago |
| MLEBench lite | Qwen3-Coder | Valid Score | 100 | 12 | 3mo ago |
| Turin | PriorGuide | RMSE | 0.13 | 9 | 1mo ago |
| OUP | PriorGuide | RMSE | 0.21 | 9 | 1mo ago |
| Multisensory Causal Inference | TNP-D | Average LL | -2.76 | 6 | 1mo ago |