| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Natural Language Processing | T0 MTest11 P3 (test) | Accuracy 61.4 | 42 |
| Natural Language Processing | T0 benchmark | RTE 85.8 | 18 |
| Natural Language Processing | T0 Without SCloze dataset HyperT5 variant (test) | Accuracy 60.6 | 14 |
| Zero-shot Natural Language Understanding | T0 (test) | Accuracy 65.5 | 8 |
| Task Generalization | T0 Taxonomy Evaluation Tasks (val) | OBQA 59.1 | 7 |
| Natural Language Understanding | T0 Evaluation Suite IA3 PEFT (held-out) | RTE 71.9 | 6 |
| Few-shot learning | T0 11B (test) | Avg Test Score 74.9 | 6 |
| Instruction Following | T0 | Accuracy 49 | 5 |
| Instruction Following | T0 Zero-Shot | Accuracy - | 0 |