| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Reasoning evaluation | DialogSum | Reasoning 99.1 | 33 |
| Summarization | DIALOGSUM | ROUGE-L 51.6 | 27 |
| Dialogue Summarization | DialogSum | R-L 39.4 | 15 |
| Summarization | DialogSum 1.5k examples (val) | ROUGE-L 39.1 | 11 |
| Summarization | DIALOGSUM | Std Dev ROUGE-1 0.83 | 8 |
| Controllable Summarization | DialogSum | Extent 20.45 | 7 |
| Dialogue Summarization | DialogSum 50 samples (test) | Informativeness 4.03 | 3 |