Fill in the BLANC: Human-free quality estimation of document summaries
About
We present BLANC, a new approach to the automatic estimation of document summary quality. Our goal is to measure the functional performance of a summary with an objective, reproducible, and fully automated method. Our approach achieves this by measuring the performance boost gained by a pre-trained language model with access to a document summary while carrying out its language understanding task on the document's text. We present evidence that BLANC scores have as good correlation with human evaluations as do the ROUGE family of summary quality measurements. And unlike ROUGE, the BLANC method does not require human-written reference summaries, allowing for fully human-free summary quality estimation.
Oleg Vasilyev, Vedant Dharnidharka, John Bohannon • 2020
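The core idea in the abstract, scoring a summary by how much it helps a language model reconstruct masked text from the document, can be illustrated with a minimal sketch. This is not the authors' implementation. The names `blanc_help_sketch`, `masked_accuracy`, and `toy_predict` are hypothetical, the `"."` filler baseline and the one-token-at-a-time masking are assumptions made for illustration, and the toy predictor is a stand-in for a real pre-trained masked language model.

```python
from collections import Counter
from typing import Callable, List, Sequence

MASK = "[MASK]"
FILLER = "."  # assumption: neutral filler used as the "no summary" baseline

# A predictor takes the full context (prefix + masked sentence) and
# returns its guess for the single masked token.
Predictor = Callable[[Sequence[str]], str]


def masked_accuracy(prefix: Sequence[str], sentence: Sequence[str],
                    predict: Predictor) -> float:
    """Mask each sentence token in turn and return the fraction guessed correctly."""
    if not sentence:
        return 0.0
    hits = 0
    for i, target in enumerate(sentence):
        masked: List[str] = list(sentence)
        masked[i] = MASK
        guess = predict(list(prefix) + masked)
        hits += int(guess == target)
    return hits / len(sentence)


def blanc_help_sketch(summary: Sequence[str], sentence: Sequence[str],
                      predict: Predictor) -> float:
    """Accuracy with the summary as prefix minus accuracy with a same-length filler."""
    filler = [FILLER] * len(summary)
    with_summary = masked_accuracy(summary, sentence, predict)
    with_filler = masked_accuracy(filler, sentence, predict)
    return with_summary - with_filler


def toy_predict(context: Sequence[str]) -> str:
    """Hypothetical stand-in for a masked LM.

    Guesses the first context word that occurs exactly once, ignoring the
    mask and filler tokens. A real setup would query a pre-trained model.
    """
    words = [t for t in context if t not in (MASK, FILLER)]
    counts = Counter(words)
    for word in words:
        if counts[word] == 1:
            return word
    return ""


sentence = ["the", "cat", "sat", "down"]
relevant = blanc_help_sketch(["cat", "sat"], sentence, toy_predict)
irrelevant = blanc_help_sketch(["dog", "ran"], sentence, toy_predict)
```

With this toy predictor, the summary that shares words with the sentence yields a positive boost while the unrelated one yields none. The score needs only the document and the candidate summary, with no human-written reference.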
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Factual Consistency Evaluation | SummaC | CGS | 54.1 | 52 |
| Factual Consistency Evaluation | QAGS XSUM | Spearman Correlation | 1.6 | 39 |
| Factual Consistency Evaluation | QAGS CNNDM | Spearman Correlation | 22.2 | 38 |
| Factual Consistency Evaluation | TRUE benchmark | PAWS (AUC-ROC) | 56 | 37 |
| Factual Consistency Evaluation | SummEval | Spearman Correlation | 19 | 36 |
| Opinion Summarization Metric Evaluation | OPINSUMMEVAL | Aspect Relevance | 56 | 32 |
| Factual Consistency Evaluation | SamSum | Spearman Correlation | 9.1 | 30 |
| Factual Consistency Evaluation | FRANK CNNDM | Spearman Correlation | 34.2 | 30 |
| Factual Consistency Evaluation | FRANK-XSum (FRK-X) | Spearman Correlation | 6.5 | 30 |
| Factual Consistency Evaluation | FRK-C | Kendall's Tau | 26 | 22 |