
CNN Dailymail

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Abstractive Summarization | CNN/DailyMail full length F-1 (test) | ROUGE-1 41.69 | 48 |
| Open ended generation | CNN DailyMail | ROUGE-L 24.3 | 40 |
| Language Generation | CNN/DailyMail | Accuracy 27.16 | 35 |
| Uncertainty Quantification | CNN/DailyMail | Hamming AUC 0.745 | 28 |
| Summarization | CNN/DailyMail | Hamming Score -0.276 | 28 |
| Abstractive Summarization | CNN/DailyMail | ROUGE-1 44.51 | 25 |
| Summarization | CNN/DailyMail (test) | 1st Metric 44.16 | 22 |
| Abstractive Summarization | CNN/DailyMail Summarization | Hamming Distance 1.597 | 20 |
| Length-Constrained Text Generation | CNN/DailyMail | Win Rate 16.43 | 10 |
| Text Generation | CNN/DailyMail (test) | LCTG Error Rate (E) 3.18 | 10 |
| Text Summarization | CNN/DailyMail (test) | ROUGE-1 33.23 | 9 |
| Context Attribution | CNN Dailymail (1000 examples) | Log Probability Drop 1.48 | 9 |
| News Summarization | CNN DailyMail | BLEU 5.41 | 8 |
| Extractive Summarization | CNN/DailyMail anonymized (test) | ROUGE-1 42.69 | 8 |
| Text Summarization | CNN DailyMail | ROUGE-1 38.58 | 7 |
| Summarization | CNN/DailyMail | Distribution Time (s) 4.68 | 7 |
| Summarization | CNN/DailyMail 50 document sample (sampled) | PPL 0.3 | 7 |
| Summarization | CNN/DailyMail (evaluation) | ROUGE-1 44.45 | 7 |
| Text Summarization | CNN/DailyMail 100-example subset (ACU protocol) (test) | ACU 0.4421 | 6 |
| Summarization | CNN/DailyMail human evaluation (100 samples) | Relevance Score 43 | 6 |
| Abstractive Summarization | CNN/DailyMail | Baseline Throughput (samples/s) 3.4 | 5 |
| Abstractive Text Summarization | CNN/DailyMail | QA Score 56.1 | 4 |
| Summarization | CNN/DailyMail random subset | Non-Redundancy 159 | 4 |
| Entailment Classification | CNN DailyMail (test) | Avg Entailment Probability 91.2 | 2 |