| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | Combined 7 Datasets | Average Score: 45 | 18 |
| All-in-One Image Restoration | Combined (Deraining, Desnowing, Dehazing) | PSNR: 34.02 | 13 |
| Bayesian neural network regression | Combined (test) | RMSE: 3.939 | 6 |
| Malicious Prompt Detection | Combined All Datasets (test) | ASR: 4.5 | 6 |
| Language Understanding and Reasoning | Combined (GSM8k, MATH500, MAWPS, SVAMP, AQuA, GLUE, CSQA, OBQA) | Average Score: 72.94 | 5 |
| Probabilistic Calibration | Combined 20K labeled samples | Brier Score: 0.0759 | 5 |
| Data-to-text generation | Combined | FE: 8.05 | 3 |
| Shadow Detection | Combined Dataset | Testing Time (hours): 0.55 | 3 |