| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Writing | WritingBench | Score 85.87 | 74 |
| Writing | WritingBench v1 (test) | Average Score 85.3 | 61 |
| Instruction Following | WritingBench | Average Score 81 | 29 |
| Instruction Following | WritingBench (Out-of-Domain) | Average Score 7.9 | 23 |
| Open-ended Writing | WritingBench | Score 75.76 | 20 |
| Long-form Writing | WritingBench | Score 88.27 | 18 |
| Creative Writing | WritingBench | Score 57.9 | 18 |
| Controllable writing | WritingBench (WB) | WB-A Score 79.8 | 17 |
| Writing capabilities | WritingBench (test) | Score 8.56 | 12 |
| Writing capability evaluation | WritingBench November 2025 (official leaderboard) | Overall Score 83.87 | 9 |
| Long-form generation | WritingBench | Score 5.1 | 6 |
| Writing and Arena Evaluation | WritingBench | Accuracy 87.63 | 3 |
| Generative Performance | WritingBench | Pearson r 0.62 | 1 |