| Dataset Name | SOTA Method | Metric | Value | | Updated |
|---|---|---|---|---|---|
| DUC 2004 (test) | | ROUGE-1 | 38.57 | 115 | 3mo ago |
| CNN/Daily Mail (test) | Top Down Transformer | ROUGE-2 | 38.42 | 77 | 20d ago |
| Gigaword (test) | EncDec+DOC | ROUGE-1 | 46.99 | 75 | 3mo ago |
| DialogueSUM (test) | | ROUGE-L | 57 | 49 | 2mo ago |
| CNN/DM | Graft-DFLASH(16) | MAT | 4.23 | 34 | 13d ago |
| Text Summarization | XMARK | BA | 89.5 | 24 | 1mo ago |
| CNN/DM | PRISM | TPS Score | 217.02 | 20 | 3mo ago |
| Summarize | CFA | Average Score | 67.39 | 18 | 3mo ago |
| NYT (test) | | R1 Score | 49.18 | 18 | 3mo ago |
| CNN/DailyMail | RSBH | BA | 97.34 | 16 | 21d ago |
| CNN/DM | 13B (Multi-Task Tuning) | ROUGE-2 | 16.95 | 16 | 3mo ago |
| Text Summarization | PrahokBARTbig | ROUGE-L | 26.23 | 16 | 3mo ago |
| SummEval Global | Themis-8B | Coherence | 85.2 | 16 | 3mo ago |
| TL;DR | MultiMetric | AlignScore | 94.2 | 15 | 6d ago |
| QAGS-X | DIFFSCORE-FT | Pearson Correlation | 0.248 | 15 | 20d ago |
| QAGS-C | DIFFSCORE-FT | Pearson Correlation Coefficient | 0.73 | 15 | 20d ago |
| Rank19 | AlignScore | ACC | 84.5 | 15 | 20d ago |
| Newsroom segment-level | GPTScore | Coherence (COH) | 0.684 | 15 | 20d ago |
| SummEval segment-level | UniEval | Coherence | 60 | 15 | 20d ago |
| REALSumm system-level | DIFFSCORE | Coverage | 49.2 | 15 | 20d ago |
| Text Summarization Long Q, Short A (test) | WorldCup | ROUGE-1 | 31.8 | 15 | 3mo ago |
| Annotated English Gigaword standard (test) | REP(UNI) | ROUGE-1 | 39.81 | 15 | 3mo ago |
| Summary | ProbMoE | LLM-as-judge Score | 44.4 | 13 | 17h ago |
| CNN DM | MoE-SpAc | TPS | 50.17 | 13 | 2mo ago |
| CNN DailyMail | Poolingformer | ROUGE-1 | 38.58 | 13 | 2mo ago |