| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Summarization Evaluation | SummEval | Coherence | 57 | 41 |
| Summarization Evaluation | SummEval | Avg Spearman Rho | 0.6 | 40 |
| Factual Consistency Evaluation | SummEval | Spearman Correlation | 46.6 | 36 |
| Factual Consistency Evaluation | SummEval (test) | Pearson CC | 66.3 | 22 |
| Summarization Evaluation | SummEval 1.0 (test) | Coherence (Spearman rho) | 0.5944 | 21 |
| Comparative Assessment | SummEval | Coherence Accuracy | 68.9 | 18 |
| Text Quality Meta-evaluation | SummEval (Local) | Coherence | 0.687 | 16 |
| Text Summarization | SummEval Global | Coherence | 85.2 | 16 |
| Fact-checking | SummEval | Balanced Accuracy | 77.3 | 15 |
| Opinion Summarization | SUMMEVAL-OP 1.0 (Round-II) | FL (Fluency) | 5 | 13 |
| Summarization | SummEval | Completeness | 0.72 | 11 |
| Summarization Meta-evaluation | SummEval (test) | Coherence (Pearson r) | 0.668 | 11 |
| Text Summarization Evaluation | SummEval (test) | Coherence (Spearman ρ) | 0.575 | 10 |
| Meta-evaluation | SummEval | Spearman Correlation (COH) | 0.448 | 10 |
| Summarization Evaluation | SummEval | MSE | 0.495 | 8 |
| Factual Consistency Evaluation | SummEval | Pearson CC | 66.7 | 8 |
| Factual Consistency Evaluation | SummEval | Kendall's Tau | 38.4 | 8 |
| Summarization Evaluation | SummEval Relevance Domain | Corr. | 0.96 | 8 |
| Document Coherence | SUMMEVAL (test) | Accuracy | 67.19 | 8 |
| Summarization Evaluation | SummEval | Relevance (theta_ratio) | 1.55 | 7 |
| Pairwise Comparison | SummEval (anchor set) | Accuracy | 94.5 | 6 |
| Hallucination Detection | SummEval (test) | Accuracy | 71.5 | 5 |
| Summarization (Groundedness) | SummEval | Kendall's Tau | 0.65 | 5 |
| Text Summarization | SummEval | Avg Spearman Corr | 0.474 | 3 |
| Summarization | SummEval | Attribute Score (Before) | 18.3 | 3 |