| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| DAG Learning | Synthetic (test) | SID | 16 | 101 |
| Email address extraction | Synthetic dataset | Accuracy | 100 | 70 |
| System Identification | Synthetic dataset | RE | 1 | 50 |
| Regression | Synthetic weakly-periodic Interpolation (INT) | Normalized KL Divergence | 0.01 | 43 |
| Fused Matmul and Sampling | Synthetic D=4096, V=151k | Speedup vs Multinomial Sampling | 1.98 | 36 |
| Causal Discovery | Synthetic (n=100, \|E\|=400, sample size=1000) | mAP | 99.6 | 36 |
| Causal Discovery | Synthetic n=1000, \|E\|=2000, sample size=1000 | mAP | 96.6 | 32 |
| Participatory Budgeting Rule Evaluation | Synthetic (test) | Omega'[sat^cost] | 1 | 30 |
| Bigram Language Modeling | Synthetic WebText initialization (val) | Avg JS | 0.001 | 30 |
| Bigram Language Modeling | Synthetic Random 50% initialization (val) | Avg JS Divergence | 0.0005 | 30 |
| Causal Discovery | Synthetic Exponential Noise | ABIC Score | 30.13 | 30 |
| Unknown sample identification | Synthetic | AUROC | 0.928 | 29 |
| Fair Classification | Synthetic 1.0 (test) | Accuracy | 72.7 | 28 |
| Optimization | Synthetic quartic function | Gradient Norm (× 10^-6) | 5.8 | 27 |
| K-means Clustering | Synthetic d=3, K=8, N=1600 | WCSS | 3.718 | 25 |
| K-means Clustering | Synthetic (d=3, K=8, N=800) | WCSS | 1.847 | 25 |
| K-means Clustering | Synthetic (d=3, K=8, N=400) | WCSS | 0.946 | 25 |
| K-means Clustering | Synthetic d=3, K=4, N=800 | WCSS | 5.367 | 25 |
| K-means Clustering | Synthetic d=3, K=4, N=400 | WCSS | 2.662 | 25 |
| K-means Clustering | Synthetic (d=3, K=4, N=200) | WCSS | 1.319 | 25 |
| K-means Clustering | Synthetic (d=3, K=2, N=400) | WCSS | 7.69 | 25 |
| K-means Clustering | Synthetic (d=3, K=2, N=200) | WCSS | 3.838 | 25 |
| K-means Clustering | Synthetic d=3, K=2, N=100 | WCSS | 1.9 | 25 |
| K-means Clustering | Synthetic (d=2, K=8, N=1600) | WCSS | 1.409 | 25 |
| K-means Clustering | Synthetic d=2, K=8, N=800 | WCSS | 0.701 | 25 |