| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Classification | D1 | Mean Accuracy | 99.291 | 30 |
| Time Series Forecasting | Synthetic D1 (test) | MSE | 0.512 | 16 |
| Medical Image Segmentation | D1 Pediatric Organs in CT | DSC | 85.59 | 14 |
| Regression | D1 | Average Relative MSE | 0.038 | 10 |
| Classification | D1 0.15 (test) | Mean Accuracy | 99.742 | 10 |
| Outlier Detection | D1 with only clusteriers (test) | AUC | 83.3 | 9 |
| Aspect-level sentiment classification | D1 | Accuracy | 79.11 | 9 |
| Segmentation | D1 trained on D0 (test) | Dice | 93.67 | 7 |
| Segmentation | D1 evaluated after training on D2 | Dice Score | 93.67 | 7 |
| Root Cause Localization | D1 (complete data conditions) | Top-1 Score | 82.1 | 7 |
| Missing Data Imputation | D1 (test) | Micro AUPRC | 81.64 | 7 |
| Hospital readmission prediction | D1 (test) | Mean AUPRC | 21.51 | 7 |
| Failure Triage | D1 complete data conditions | Precision | 94.6 | 6 |
| Anomaly Detection | D1 complete data conditions | Precision | 92.5 | 6 |
| Time-Domain Prediction | D1 | NMSE (dB) | -19.5 | 6 |
| Reliability Assessment | D1 (test) | AU-ARC | 91.96 | 5 |
| Frequency-Domain Prediction | D1 | NMSE (dB) | -18.43 | 5 |
| People counting | D1 seen environment (70%-30%) | Average Precision (AP) | 0.838 | 4 |
| CSI Reconstruction | D1 | NMSE (dB) | -19.26 | 3 |
| Root Cause Localization | D1 (test) | Execution Time (s) | 21.52 | 2 |
| Failure Triage | D1 (test) | Execution Time (s) | 1.56 | 2 |
| Anomaly Detection | D1 (test) | Execution Time (s) | 5.23 | 2 |
| Document Question Answering | D1 | Correct Answers | 20 | 2 |
| Visual Question Answering | D1 | Effective Answer Rate (C+P) | 50 | 2 |
| Page Localization | D1 | Page Localization Success Rate | 1 | 1 |