
BookSum

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Long document summarization | BookSum (test) | ROUGE 1: 43.19 | 37 |
| Watermarking Detection | BOOKSUM (test) | Detection Rate (No Attack): 100 | 24 |
| Watermark Detection | BOOKSUM | TP @ FP=1%: 100 | 24 |
| Summarization | BookSum (test) | Comp Score: 5 | 24 |
| Document Summarization | BookSum | ROUGE-1 Score: 46.62 | 22 |
| Long-context Input (Summarization) | BookSum | TPT (s): 3.76 | 20 |
| Spoofing Attack Robustness | BookSum | AUC: 0.9552 | 20 |
| Paraphrase Attack Robustness | BookSum | AUC: 98.49 | 20 |
| Summarization | BookSum Chapter Level | ROUGE-1: 42.68 | 14 |
| Language Modeling | BookSum | Perplexity: 19.35 | 13 |
| Summarization Faithfulness | BookSum | SummaC Score: 39.84 | 12 |
| Abstractive Summarization | BookSum sampled (test) | ROUGE Score: 17.71 | 12 |
| Faithfulness Evaluation | BookSum (test) | SummaC: 40.71 | 12 |
| Grounded Payoff Tracking | BookSum | Detection Accuracy: 69.8 | 12 |
| Narrative Reasoning | BookSum oracle timing | Average Score: 94 | 12 |
| Text generation | BookSum | F1 Score: 26.5 | 10 |
| Watermarking Efficiency | BookSum | Total Time (s): 1,224.25 | 10 |
| Summarization | BookSum | Reward: 0.277 | 6 |
| Summarization | BookSum Average latest (test) | Average ROUGE: 17.47 | 6 |
| Summarization | BookSum Trun. latest (test) | Avg ROUGE: 16.68 | 6 |
| Summarization | BookSum No Trun. latest (test) | Average ROUGE: 17.74 | 6 |
| Watermarking Token Efficiency | BOOKSUM (test) | Avg Tokens per Sentence: 186.7 | 5 |