| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Summarization | Reddit TIFU | ROUGE-1 15.81 | 10 |
| Summarization | Reddit TIFU (test) | ROUGE-2 0.116 | 7 |
| Discrimination between Good Faith and Problematic agents (Summarization) | Reddit TIFU 16.1:1 | Cohen's d 7.23 | 6 |
| Abstractive Summarization | Reddit TIFU 42k samples (test) | ROUGE-1 26.63 | 5 |
| Faithfulness discrimination | Reddit TIFU | AUC 77.2 | 4 |
| Summarization | Reddit TIFU Long (test) | ROUGE-1 30.31 | 4 |
| Summarization | Reddit TIFU (evaluation) | ROUGE-1 30.3 | 3 |
| Abstractive Summarization | Reddit TIFU | ROUGE-1 27.99 | 1 |