Faithful Chart Summarization with ChaTS-Pi

About

Chart-to-summary generation can help explore data, communicate insights, and assist visually impaired people. Multi-modal generative models have been used to produce fluent summaries, but they can suffer from factual and perceptual errors. In this work, we present CHATS-CRITIC, a reference-free chart summarization metric for scoring faithfulness. CHATS-CRITIC is composed of an image-to-text model that recovers the table from a chart and a tabular entailment model that scores the summary sentence by sentence. We find that CHATS-CRITIC agrees with human ratings of summary quality better than reference-based metrics, whether learned or n-gram based, and can further be used to fix candidate summaries by removing unsupported sentences. We then introduce CHATS-PI, a chart-to-summary pipeline that leverages CHATS-CRITIC during inference to fix and rank sampled candidates from any chart-summarization model. We evaluate CHATS-PI and CHATS-CRITIC using human raters, establishing state-of-the-art results on two popular chart-to-summary datasets.
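
The flow described in the abstract (recover a table, score each sentence against it, drop unsupported sentences, rank candidates) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `derender` and `entail` callables, the 0.5 threshold, the period-based sentence splitter, and the choice to rank candidates by the fraction of supported sentences are all assumptions made here for the example, and the toy stand-ins at the bottom replace real models.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces; neither signature comes from the paper.
Derender = Callable[[bytes], str]     # chart image -> linearized table text
Entail = Callable[[str, str], float]  # (table, sentence) -> support score in [0, 1]


def split_sentences(summary: str) -> List[str]:
    """Naive split on '.', for illustration only (it breaks on decimals)."""
    return [s.strip() + "." for s in summary.split(".") if s.strip()]


def critic_scores(table: str, summary: str, entail: Entail) -> List[Tuple[str, float]]:
    """Score every sentence of the summary against the recovered table."""
    return [(s, entail(table, s)) for s in split_sentences(summary)]


def critic_score(scored: Sequence[Tuple[str, float]], threshold: float = 0.5) -> float:
    """Assumed aggregation: fraction of sentences judged supported."""
    if not scored:
        return 0.0
    return sum(p >= threshold for _, p in scored) / len(scored)


def fix_summary(scored: Sequence[Tuple[str, float]], threshold: float = 0.5) -> str:
    """Keep only the sentences whose support score reaches the threshold."""
    return " ".join(s for s, p in scored if p >= threshold)


def chats_pi(chart: bytes, candidates: Sequence[str], derender: Derender,
             entail: Entail, threshold: float = 0.5) -> str:
    """Rank sampled candidates by critic score and return the best one, fixed."""
    table = derender(chart)
    best, best_score = "", -1.0
    for cand in candidates:
        scored = critic_scores(table, cand, entail)
        score = critic_score(scored, threshold)
        if score > best_score:
            best, best_score = fix_summary(scored, threshold), score
    return best


# Toy stand-ins so the sketch runs without any model (purely hypothetical).
def toy_derender(chart: bytes) -> str:
    return "year | sales\n2020 | 10\n2021 | 15"


def toy_entail(table: str, sentence: str) -> float:
    # Treat a sentence as supported if every number in it appears in the table.
    numbers = [tok for tok in sentence.rstrip(".").split() if tok.isdigit()]
    cells = table.replace("|", " ").split()
    return 1.0 if numbers and all(n in cells for n in numbers) else 0.0


candidates = [
    "Sales were 10 in 2020. Sales were 99 in 2021.",
    "Sales were 10 in 2020. Sales were 15 in 2021.",
]
best = chats_pi(b"", candidates, toy_derender, toy_entail)
```

With the toy stand-ins, the second candidate is selected because both of its sentences are supported by the table, while the first candidate would be reduced to its single supported sentence. In a real setting, `derender` and `entail` would be replaced by a chart de-rendering model and a tabular entailment model.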

Syrine Krichene, Francesco Piccinno, Fangyu Liu, Julian Martin Eisenschlos • 2024

Related benchmarks

Task | Dataset | Result | Rank
Figure Captioning | SciCap First sentence | BLEU 15.53 | 10
Figure Captioning | SciCap Single-Sent Caption | BLEU 18 | 9
Figure Captioning | SciCap Caption w/ <=100 words | BLEU 16.16 | 9
Sentence Classification | Chart-To-Text (test) | Accuracy 92.38 | 8
Chart Summarization | SciCap First sentence | CHATS-CRITIC 51.97 | 4
Figure Captioning | SciCap SciTune info | -- | 2
