| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Alignment Evaluation | Open-ended questions | Win Rate: 68.9 | 6 |
| Instruction following | Open-Ended Questions (Human Generated Questions) | Instruction Following Rate: 70.1 | 2 |
| Instruction following | Open-Ended Questions (GPT-4 Generated Questions) | Instruction Following Rate: 96.9 | 2 |
| Instruction following | Open-Ended Questions (GPT-3.5-Turbo Generated Questions) | Instruction Following Rate: 95.9 | 2 |