| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Concept Extraction Evaluation | 4 classification datasets average | RAcc 99.8 | 35 |
| Zero-shot Classification | Classification Datasets (MMLU, OBQA, ARC-e, WinoGrande, ARC-c, PIQA, HellaSwag) | MMLU (5-shot) 37.1 | 18 |
| Classification | 80 classification datasets | Median Effect Size (F1 pts) 0.11 | 17 |
| Open-Vocabulary Classification | 11 classification datasets (test) | ImageNet Accuracy 76.77 | 16 |
| Classification | medium-sized classification datasets | Accuracy 78.58 | 14 |
| Classification | 6 out-of-domain classification datasets (test) | Accuracy 65.2 | 9 |
| Tabular Data Generation | Classification Datasets | Avg. JSD 0.05 | 2 |
| Classification | 15 Classification Datasets | TabMixNN Wins 1,067 | 1 |