| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| LLM Alignment | SHP | Diversity 89.3 | 15 |
| Direct Preference Optimization | SHP AlpacaEval 2.0 | LCWR 18.44 | 14 |
| Reward Maximization | SHP | Win Rate 0.53 | 12 |
| Binary/Pairwise Classification | SHP | Accuracy 69.5 | 9 |
| Open-ended Dialogue | SHP OOD | Win Rate 77.5 | 4 |