| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Causal Inference | BC out-of-sample | sqrt(PEHE) 0.69 | 18 |
| Causal Inference | BC within-sample | sqrt(PEHE) 0.73 | 18 |
| Minority class representation | BC | Minority Class Percentage 30.5 | 13 |
| Speech Quality Assessment | BC 19 | LCC 0.87 | 12 |
| DCR baseline protection analysis | BC | DCR Baseline Protection 51.1 | 12 |
| Membership Inference Attack | BC | Success Rate 53.6 | 12 |
| Synthetic Data Evaluation (Column Pair Trends) | BC | Column Pair Trends Score 97.6 | 12 |
| Overfitting Protection Evaluation | BC | DCR Overfitting Protection 0.965 | 12 |
| Tabular Synthetic Data Generation | BC | Column Shape Score 0.987 | 12 |
| Utility evaluation | BC | Balanced Acc 72.1 | 11 |
| Question Answering | BC | Performance Score 6.2 | 8 |
| Deep Research | BC (test) | Mean Correct Answer Rate 620 | 8 |
| Clustering | BC | Avg Silhouette Score 0.557 | 7 |
| Clustering | BC | ARI 77.9 | 7 |
| Point-level consensus correctness prediction | BC | AUPRC 97.1 | 4 |
| Named Entity Recognition | BC (test) | Average F1 81.54 | 4 |