| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Multi-turn dialogue | MT-Eval | LLM-EVAL Score | 8.16 | 20 |
| Multi-turn Instruction Following | MT-Eval | CSR | 93.62 | 20 |
| Multi-turn dialogue evaluation | MT-Eval | Expansion Score | 7.34 | 9 |
| Multi-turn conversation | MT-Eval | Accuracy | 8.28 | 9 |
| Structured Reasoning and Evaluation | MT-Eval | CSR | 95.56 | 8 |