BillSum

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Summarization | BillSum | Accuracy 69.6 | 28 |
| Text Summarization | BillSum (test) | Coherence 97.3 | 11 |
| Plain Summarization | BillSum | ROUGE-1 46.7 | 9 |
| Discrimination between Good Faith and Problematic agents (Summarization) | BillSum 9.3:1 | Cohen's d 5.91 | 6 |
| Abstractive Summarization | BillSum | ROUGE-1 59.67 | 6 |
| Abstractive Summarization | BillSum 24k samples (test) | ROUGE-1 57.31 | 5 |
| Text Simplification | BillSum 500 samples (human evaluation) | Coherence 4.3 | 4 |
| Faithfulness discrimination | BillSum | AUC 73.2 | 4 |