| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Instruction Following | IFBench | Pass@1 (Strict) | 68.1 | 68 |
| Instruction Following | IFBench | Accuracy | 67.77 | 25 |
| Reward Modeling | IFBench | Accuracy | 69.3 | 17 |
| Reward Modeling | IFBench Hard | Accuracy | 78 | 16 |
| Reward Modeling | IFBench Normal | Accuracy | 80.5 | 16 |
| Reward Modeling | IFBench Simple | Accuracy | 87.2 | 16 |
| Instruction Following | IFBench | IFBench Score | 43.28 | 12 |
| Alignment | IFBench | pass@1 | 41.7 | 7 |
| Reward Modeling | IFBench (test) | Accuracy | 57.9 | 7 |
| Instruction Following | IFBench (test) | Score | 38.61 | 5 |
| Instruction Following | IFBench Strict | Avg@10 | 31.5 | 2 |