| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Chinese Multitask Language Understanding | CMMLU | Accuracy | 81.8 | 67 |
| Language Understanding | CMMLU | Accuracy | 90.1 | 62 |
| General Knowledge | CMMLU | Accuracy | 89.5 | 50 |
| Multitask Language Understanding | CMMLU (test) | Accuracy | 78.3 | 38 |
| Multilingual Understanding | CMMLU | Score | 81.5 | 32 |
| Chinese General Knowledge | CMMLU | Accuracy | 90.9 | 25 |
| Knowledge | CMMLU | Knowledge Score | 84.72 | 25 |
| Multi-task Language Understanding | CMMLU | Accuracy | 89.28 | 24 |
| Examination | CMMLU | Score | 61.3 | 20 |
| Chinese Language Knowledge and Reasoning | CMMLU | Score | 77.01 | 14 |
| General Language Understanding | CMMLU | Overall Accuracy | 77.3 | 14 |
| Comprehensive Examination | CMMLU (test) | Accuracy | 68.1 | 14 |
| Chinese Language Understanding | CMMLU (test) | CMMLU Score | 0.574 | 13 |
| Chinese Language Understanding | CMMLU | Score | 90.9 | 10 |
| General Reasoning | CMMLU (test) | Accuracy | 64.1 | 8 |
| Comprehensive cognitive reasoning | CMMLU | Score | 53.45 | 8 |
| Knowledge Evaluation | Cmmlu_c | Accuracy | 36.88 | 7 |
| Chinese multiple-choice evaluation | CMMLU | CMMLU College Mathematics Accuracy | 61.9 | 6 |
| Question Answering | CMMLU | Accuracy | 88.1 | 6 |
| Multilingual Knowledge | CMMLU | CMMLU Score | 71.8 | 6 |
| Medical Knowledge Evaluation | CMMLU Med | Accuracy | 86.89 | 5 |
| Knowledge & Reasoning | CMMLU | Accuracy | 63.4 | 4 |
| General Domains | CMMLU | Accuracy | 0.865 | 4 |
| Knowledge | CMMLU (test) | Knowledge CMMLU Test Accuracy | 32.11 | 3 |
| Knowledge Reasoning | CMMLU c | Accuracy (Normalized) | 36.08 | 3 |