| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Roleplay 1500 users | Oracle | Winrate | 90.9 | 10 | 1mo ago |
| Alpaca-GPT4 Style | F-beta | Win Rate | 85.3 | 5 | 3mo ago |
| Alpaca-GPT4 (Expertise) | DPO | Win Rate | 76.97 | 5 | 3mo ago |
| Synthetic Color | DPO | Win Rate | 85.66 | 5 | 3mo ago |
| HelpSteer2 (test) | STACKELBERGGDA | Avg Pref Score vs QWEN2.5-0.5B | 0.8 | 5 | 3mo ago |