Protecting multimodal large language models against misleading visualizations

About

Visualizations play a pivotal role in daily communication in an increasingly data-driven world. Research on multimodal large language models (MLLMs) for automated chart understanding has accelerated massively, with steady improvements on standard benchmarks. However, for MLLMs to be reliable, they must be robust to misleading visualizations, i.e., charts that distort the underlying data, leading readers to draw inaccurate conclusions. Here, we uncover an important vulnerability: MLLM question-answering (QA) accuracy on misleading visualizations drops on average to the level of the random baseline. To address this, we provide the first comparison of six inference-time methods to improve QA performance on misleading visualizations, without compromising accuracy on non-misleading ones. We find that two methods, table-based QA and redrawing the visualization, are effective, with improvements of up to 19.6 percentage points. We make our code and data available.
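The abstract names table-based QA without describing it, so the following is a minimal sketch of one plausible reading: extract the chart's underlying data table first, then answer the question from the table text alone. The names `table_based_qa`, `table_to_text`, `extract_table`, and `answer_from_text` are hypothetical and do not come from the authors' released code. The two callables stand in for whatever chart-to-table and text QA models are used.

```python
from typing import Callable, List

Table = List[List[str]]  # rows of cells; the first row holds the column headers


def table_to_text(table: Table) -> str:
    """Serialize a table as pipe-separated rows so it fits in a text prompt."""
    return "\n".join(" | ".join(row) for row in table)


def table_based_qa(
    chart_image: bytes,
    question: str,
    extract_table: Callable[[bytes], Table],   # hypothetical chart-to-table model
    answer_from_text: Callable[[str], str],    # hypothetical text-only QA model
) -> str:
    """Answer a chart question from the extracted data table only.

    The answering model never sees the chart image, so a visual distortion
    such as a truncated axis cannot influence the answer directly.
    """
    table = extract_table(chart_image)
    prompt = (
        "Answer the question using only this table.\n"
        f"{table_to_text(table)}\n"
        f"Question: {question}"
    )
    return answer_from_text(prompt)
```

Under this design the answer can only be as good as the extracted table, so extraction errors carry straight through to the final answer.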

Jonathan Tonglet, Tinne Tuytelaars, Marie-Francine Moens, Iryna Gurevych • 2025

Related benchmarks

Task	Dataset	Result
Chart Question Answering	ChartQA	--	371
Chart Visual Question Answering	CHARTOM misleading visualizations	--	23
Chart Visual Question Answering	CHARTOM non-misleading visualizations	--	23
Chart Question Answering	MC v1 (test)	Accuracy 49.18	11
Chart Question Answering	CDCC v1 (test)	Accuracy 38.18	11

