| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Summarization | TL;DR (test) | Win Rate | 82.5 | 49 |
| Summarization | TL;DR | Winrate | 91.8 | 42 |
| Preference Alignment | TL;DR (test) | Win Rate | 68.8 | 36 |
| Summarization | TL;DR (distillation set) | Word Count | 27.24 | 16 |
| Reward Modeling | TL;DR Seen (n=100) | Accuracy | 62.3 | 14 |
| Summarization | TL;DR | Completeness | 43 | 11 |
| Reward Modeling | TL;DR Overall n=150 | Accuracy | 62.9 | 7 |
| Reward Modeling | TL;DR Unseen (n=150) | Accuracy | 62.4 | 7 |
| Reward Modeling | TL;DR n=150 Seen | Accuracy | 63.3 | 7 |
| Reward Modeling | TL;DR n=100 Unseen | Accuracy | 61.5 | 7 |
| Summarization | TL;DR | Win Rate | 92.8 | 6 |
| Summarization (Groundedness) | TL;DR | Kendall's Tau | 0.46 | 5 |
| Summarization (Completeness) | TL;DR | Kendall's Tau | 0.44 | 5 |
| Preference Alignment | TL;DR | GRA (%) | 64.4 | 4 |
| Summarization | TL;DR | Winrate | 50.5 | 4 |
| Summarization Preference Evaluation | TL;DR (val) | - | - | 0 |
| Text Summarization | TL;DR (test) | - | - | 0 |