| Dataset Name | SOTA Method | Metric | Score | Count | Updated |
|---|---|---|---|---|---|
| SAMSum (test) | InstructDS | ROUGE-2 | 33 | 80 | 4d ago |
| SAMSum 1.0 (test) | | R1 | 51 | 11 | 4d ago |
| SAMSum | PEGASUS-2B (calibrated) | ROUGE-2 | 29.88 | 10 | 4d ago |
| AMI (test) | SUMM^N | Conciseness | 4.13 | 9 | 4d ago |
| TODSum (test) | InstructDS | ROUGE-1 | 89.2 | 7 | 4d ago |
| TODSum | InstructDS | ROUGE-1 | 89.3 | 7 | 4d ago |
| DialogSum | InstructDS | R-1 | 47.8 | 7 | 4d ago |
| SAMSum 200 samples (test) | ChatGPT | Faithfulness | 4.94 | 6 | 4d ago |
| SAMSum 30 samples (test) | ChatGPT | Faithfulness | 4.52 | 6 | 4d ago |
| TweetSumm (test) | DIONYSUS | ROUGE-1 | 30.7 | 6 | 4d ago |
| Email (test) | DIONYSUS | ROUGE-1 Score | 28.9 | 6 | 4d ago |
| Reddit (test) | DIONYSUS | ROUGE-1 | 24.8 | 6 | 4d ago |
| TVMegaSite | BART-LS | ROUGE-1 | 51.8 | 6 | 4d ago |
| SAMSum All-possible Names (test) | Ins | R2 | 28.44 | 4 | 4d ago |
| SAMSum In-distribution Names (test) | Ins | R2 | 28.79 | 4 | 4d ago |
| DialogSum 50 samples (test) | | Informativeness | 4.03 | 3 | 4d ago |
| SAMSum 50 samples (test) | | Informativeness | 4 | 3 | 4d ago |
| ICSI (test) | SUMM^N | Readability | 4.12 | 2 | 4d ago |