SummEval

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Summarization Evaluation | SummEval | Coherence 57 | 41
Summarization Evaluation | SummEval | Avg Spearman Rho 0.6 | 40
Factual Consistency Evaluation | SummEval | Spearman Correlation 46.6 | 36
Factual Consistency Evaluation | SummEval (test) | Pearson CC 66.3 | 22
Summarization Evaluation | SummEval 1.0 (test) | Coherence (Spearman rho) 0.5944 | 21
Comparative Assessment | SummEval | Coherence Accuracy 68.9 | 18
Text Quality Meta-evaluation | SummEval (Local) | Coherence 0.687 | 16
Text Summarization | SummEval Global | Coherence 85.2 | 16
Fact-checking | SummEval | Balanced Accuracy 77.3 | 15
Opinion Summarization | SUMMEVAL-OP 1.0 (Round-II) | FL (Fluency) 5 | 13
Summarization | SummEval | Completeness 0.72 | 11
Summarization Meta-evaluation | SummEval (test) | Coherence (Pearson r) 0.668 | 11
Text Summarization Evaluation | SummEval (test) | Coherence (Spearman ρ) 0.575 | 10
Meta-evaluation | SummEval | Spearman Correlation (COH) 0.448 | 10
Summarization Evaluation | SummEval | MSE 0.495 | 8
Factual Consistency Evaluation | SummEval | Pearson CC 66.7 | 8
Factual Consistency Evaluation | SummEval | Kendall's Tau 38.4 | 8
Summarization Evaluation | SummEval Relevance Domain | Corr. 0.96 | 8
Document Coherence | SUMMEVAL (test) | Accuracy 67.19 | 8
Summarization Evaluation | SummEval | Relevance (theta_ratio) 1.55 | 7
Pairwise Comparison | SummEval (anchor set) | Accuracy 94.5 | 6
Hallucination Detection | SummEval (test) | Accuracy 71.5 | 5
Summarization (Groundedness) | SummEval | Kendall's Tau 0.65 | 5
Text Summarization | SummEval | Avg Spearman Corr 0.474 | 3
Summarization | SummEval | Attribute Score (Before) 18.3 | 3