| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | WebGPT | Average Score: 76.42 | 18 |
| Preference Alignment | WebGPT (test) | Accuracy: 61.24 | 11 |
| Direct Preference Optimization | WebGPT | Accuracy: 58.92 | 11 |
| Reward Modeling | WebGPT | Accuracy: 58.4 | 8 |
| Preference Classification | WebGPT comparisons (test) | Accuracy: 60.8 | 7 |