| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Multi-News (test) | UL20B | ROUGE-2 | 21.7 | 45 | 4d ago |
| WCEP (test) | SimCAS | R-1 | 46.29 | 27 | 4d ago |
| MDS | | Length | 1,681.6 | 14 | 4d ago |
| WikiSUM (test) | BASS | ROUGE-1 | 44.33 | 14 | 4d ago |
| WCEP 50 (test) | PRIMERA (reported) | ROUGE-1 | 43 | 12 | 4d ago |
| Multi-News 256 (test) | PRIMERA (reported) | ROUGE-1 | 46 | 12 | 4d ago |
| ICLR metareview motivations (test) | | ROUGE-1 | 39 | 11 | 4d ago |
| Multi-XSci (test) | SOTA(Pointer Generator) | ROUGE-1 | 34.11 | 11 | 4d ago |
| DUC 2004 (test) | PG-MMR | ROUGE-1 Score | 36.42 | 9 | 4d ago |
| DUC 2007 (test) | | ROUGE-1 | 43.426 | 8 | 4d ago |
| WCEP10 | DelimScaling (Qwen2.5-7B) | ROUGE-1 | 29.77 | 6 | 4d ago |
| DUC 2007 250 (test) | Centrum | ROUGE-1 | 35.3 | 6 | 4d ago |
| arXiv (test) | PRIMERA | ROUGE-1 | 34.6 | 5 | 4d ago |
| Multi-News (test) | | Informativeness | 150 | 4 | 4d ago |
| MDS SPARK (test) | ATTR. FIRST | ROUGE-L | 21.1 | 3 | 4d ago |
| ICLR reviews (test) | GLIMPSE | Discriminativeness | 93.75 | 3 | 4d ago |
| DUC 2007 (human evaluation) | Centrum | Informativeness | 65.5 | 3 | 4d ago |
| MDS human evaluation | ATTR. FIRST | Fluency | 4.9 | 2 | 4d ago |