| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Chain-of-Thought Reasoning | Reasoning Dataset | Accuracy (Acc): 86.9 | 21 |
| Reasoning | 7 reasoning datasets | Reasoning Accuracy: 65.74 | 15 |
| Natural Language Generation | Reasoning | ROUGE-1: 74.23 | 8 |
| System Performance Evaluation | Reasoning | Throughput: 194.21 | 8 |
| Tokenizer Compression | Reasoning | Bits per Token: 3.51 | 5 |