| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Instruction Following | Instruction-Following (val) | ROUGE-L 31.1 | 33 |
| Preference Modeling | Instruction Following | Accuracy 65.2 | 20 |
| Instruction Following | Instruction Following | LC 0.2027 | 10 |
| Reasoning | Instruction following | Normalized Score 100 | 9 |
| Instruction-Following | Instruction-Following Alpaca-V2 Arena-Hard | Alpaca V2 Score 9.2 | 6 |
| Instruction Following | Instruction Following SFT 1.0 (eval) | SFT Score 59.4 | 6 |