| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Reward Modeling | HHH-Alignment Reversed | Accuracy 86.2 | 9 |
| Reward Modeling | HHH-Alignment Standard | Accuracy 91.8 | 9 |
| Reward Modeling | HHH-Alignment (OOD) | Accuracy 79.8 | 8 |
| Reward Modeling | HHH-Alignment OOD (test) | Score 78.7 | 8 |
| Reward Modeling | HHH Alignment | Accuracy 87.8 | 4 |