| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Causal Perturbation Prediction | SCM dataset linear SCMs (test) | W29.82 | 10 | |
| Regression | SCM20d | Epsilon 0.0074 | 9 | |
| Prediction | SCM marginal shift | ROC AUC 1 | 9 | |
| Binary Classification | SCM spurious shift (test) | ROC AUC 0.733 | 9 | |
| Sample generation | scm20d | Standardized Energy Distance 10 | 7 | |
| Sample generation | scm1d | Standardized energy distance 10 | 7 | |
| Multi-Target Regression | SCM20d | Running time (s) 79.9444 | 5 | |
| Multi-Target Regression | SCM20d | Model size (MB) 275.8278 | 5 | |
| Multi-Target Regression | SCM1d | Model Size (MB) 660.5839 | 5 | |
| Multivariate Regression Uncertainty Quantification | scm20d | Coverage (%) 99.3 | 4 | |
| Multivariate Regression | scm20d | Coverage 90.4 | 4 | |
| Multivariate Uncertainty Quantification | scm20d | Normalized Volume 4.25 | 4 | |
| Multivariate Uncertainty Quantification | scm1d | Normalized Volume 4.43 | 4 | |
| Multivariate Regression | scm20d | Normalized Volume 3.45 | 4 | |
| Long-running Conversational Memory | SCM | Answer Accuracy 88.4 | 1 | |