
SummEval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Summarization Evaluation | SummEval | Avg Spearman Rho 0.902 | 45 |
| Summarization Evaluation | SummEval | Coherence 57 | 41 |
| Summarization Evaluation | SummEval | Pearson Correlation 0.546 | 40 |
| Factual Consistency Evaluation | SummEval | Spearman Correlation 46.6 | 36 |
| Factual Consistency Evaluation | SummEval (test) | Pearson CC 66.3 | 22 |
| Summarization Evaluation | SummEval 1.0 (test) | Coherence (Spearman rho) 0.5944 | 21 |
| Comparative Assessment | SummEval | Coherence Accuracy 68.9 | 18 |
| Text Quality Meta-evaluation | SummEval (Local) | Coherence 0.687 | 16 |
| Text Summarization | SummEval Global | Coherence 85.2 | 16 |
| Fact-checking | SummEval | Balanced Accuracy 77.3 | 15 |
| Opinion Summarization | SUMMEVAL-OP 1.0 (Round-II) | FL (Fluency) 5 | 13 |
| Summarization | SummEval | Completeness 0.72 | 11 |
| Summarization Meta-evaluation | SummEval (test) | Coherence (Pearson r) 0.668 | 11 |
| Relevancy | SummEval Rel | Spearman's Rho 0.48 | 10 |
| Faithfulness | SummEval | Spearman's Rho 0.676 | 10 |
| Text Summarization Evaluation | SummEval (test) | Coherence (Spearman ρ) 0.575 | 10 |
| Meta-evaluation | SummEval | Spearman Correlation (COH) 0.448 | 10 |
| Coherence Evaluation | SummEval | Accuracy 55.9 | 8 |
| Summarization Evaluation | SummEval | MSE 0.495 | 8 |
| Factual Consistency Evaluation | SummEval | Pearson CC 66.7 | 8 |
| Factual Consistency Evaluation | SummEval | Kendall's Tau 38.4 | 8 |
| Summarization Evaluation | SummEval Relevance Domain | Corr. 0.96 | 8 |
| Document Coherence | SUMMEVAL (test) | Accuracy 67.19 | 8 |
| Summarization Evaluation | SummEval | Relevance (theta_ratio) 1.55 | 7 |
| Pairwise Comparison | SummEval (anchor set) | Accuracy 94.5 | 6 |
Showing 25 of 32 rows