| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Bivariate Causal Discovery | D4 s1 | Accuracy | 79 | 33 |
| Classification | D4 | Mean Accuracy | 90.062 | 30 |
| Bivariate Causal Discovery | D4 s2c | Accuracy | 64 | 23 |
| Bivariate Causal Discovery | D4 s2b | Accuracy | 63 | 23 |
| Bivariate Causal Discovery | D4 s2a | Accuracy | 71 | 23 |
| Medical Image Segmentation | D4 | DSC | 69.04 | 14 |
| RANK | D4 V2 | Critical Depth (d50) | 3 | 12 |
| Seeker Simulation | D4 | Precision | 64.73 | 12 |
| Regression | D4 | RMSE | 0.2 | 10 |
| Column Type Annotation | D4-20+ | Micro-F1 | 87.3 | 9 |
| Aspect-level sentiment classification | D4 | Accuracy | 85.58 | 9 |
| Regression | D4 | Average Relative MSE | 0.487 | 7 |
| Time-Domain Prediction | D4 | NMSE (dB) | -9.3 | 6 |
| Reliability Assessment | D4 (test) | AU-ARC | 0.9971 | 5 |
| Frequency-Domain Prediction | D4 | NMSE (dB) | -16.87 | 5 |
| SUCC. | D4 V2 | Critical Depth (d50) | 4 | 4 |
| Compositional Reasoning | D4 V2 (test) | Stability (%) | 89.9 | 4 |
| CSI Reconstruction | D4 | NMSE (dB) | -18.27 | 3 |