| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Summarization | BillSum | Accuracy 69.6 | 28 |
| Text Summarization | BillSum (test) | Coherence 97.3 | 11 |
| Plain Summarization | BillSum | ROUGE-1 46.7 | 9 |
| Discrimination between Good Faith and Problematic agents (Summarization) | BillSum 9.3:1 | Cohen's d 5.91 | 6 |
| Abstractive Summarization | BillSum | ROUGE-1 59.67 | 6 |
| Abstractive Summarization | BillSum 24k samples (test) | ROUGE-1 57.31 | 5 |
| Text Simplification | BillSum 500 samples (human evaluation) | Coherence 4.3 | 4 |
| Faithfulness discrimination | BillSum | AUC 73.2 | 4 |