ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding

About

Automatic chart understanding is crucial for content comprehension and document parsing. Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in chart understanding through domain-specific alignment and fine-tuning. However, current MLLMs still struggle to provide faithful data and reliable analysis based only on charts. To address this, we propose ChartMoE, which employs the Mixture of Experts (MoE) architecture to replace the traditional linear projector to bridge the modality gap. Specifically, we train several linear connectors through distinct alignment tasks, which are utilized as the foundational initialization parameters for different experts. Additionally, we introduce ChartMoE-Align, a dataset with nearly 1 million chart-table-JSON-code quadruples to conduct three alignment tasks (chart-table/JSON/code). Combined with the vanilla connector, we initialize different experts diversely and adopt high-quality knowledge learning to further refine the MoE connector and LLM parameters. Extensive experiments demonstrate the effectiveness of the MoE connector and our initialization strategy, e.g., ChartMoE improves the accuracy of the previous state-of-the-art from 80.48% to 84.64% on the ChartQA benchmark.
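
The connector idea in the abstract can be illustrated with a small NumPy sketch: four linear connectors (vanilla, chart-table, chart-JSON, chart-code) seed four experts, and a router mixes the experts per visual token. This is only an illustration under stated assumptions, not the authors' implementation. The top-2 softmax routing, the toy dimensions, the random weights standing in for trained connectors, and all names (`make_linear_connector`, `MoEConnector`) are hypothetical choices that the abstract does not specify.

```python
import numpy as np

# One expert per alignment task named in the abstract, plus the vanilla connector.
EXPERT_NAMES = ["vanilla", "table", "json", "code"]


def make_linear_connector(d_vision, d_llm, rng):
    """Stand-in for a trained linear connector (weights are random here)."""
    return {
        "W": rng.normal(0.0, 0.02, size=(d_vision, d_llm)),
        "b": np.zeros(d_llm),
    }


class MoEConnector:
    """Mixture of linear experts, each initialized from a separate connector."""

    def __init__(self, pretrained, d_vision, top_k=2, rng=None):
        rng = rng or np.random.default_rng(0)
        self.names = list(pretrained)
        # Copy each connector's parameters into its own expert slot.
        self.W = np.stack([pretrained[n]["W"].copy() for n in self.names])  # (E, d_vision, d_llm)
        self.b = np.stack([pretrained[n]["b"].copy() for n in self.names])  # (E, d_llm)
        # Assumption: the router starts from small random weights.
        self.router = rng.normal(0.0, 0.02, size=(d_vision, len(self.names)))
        self.top_k = top_k

    def __call__(self, tokens):
        """Project visual tokens (N, d_vision) into the LLM space (N, d_llm)."""
        logits = tokens @ self.router                               # (N, E)
        top_idx = np.argsort(logits, axis=-1)[:, -self.top_k:]      # (N, k)
        top_logits = np.take_along_axis(logits, top_idx, axis=-1)   # (N, k)
        # Softmax over the selected experts only.
        top_logits = top_logits - top_logits.max(axis=-1, keepdims=True)
        gates = np.exp(top_logits)
        gates = gates / gates.sum(axis=-1, keepdims=True)
        # Dense evaluation of every expert keeps the sketch short.
        expert_out = np.einsum("nd,edh->neh", tokens, self.W) + self.b  # (N, E, d_llm)
        chosen = np.take_along_axis(expert_out, top_idx[:, :, None], axis=1)  # (N, k, d_llm)
        return (gates[:, :, None] * chosen).sum(axis=1), top_idx


rng = np.random.default_rng(0)
pretrained = {name: make_linear_connector(8, 16, rng) for name in EXPERT_NAMES}
connector = MoEConnector(pretrained, d_vision=8, top_k=2)
visual_tokens = rng.normal(size=(5, 8))
projected, routed_to = connector(visual_tokens)
print(projected.shape)  # (5, 16)
```

Evaluating every expert and then selecting the top-k outputs is wasteful at scale; it is done here only to keep the routing logic easy to read.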

Zhengzhuo Xu, Bowen Qu, Yiyan Qi, Sinan Du, Chengjin Xu, Chun Yuan, Jian Guo • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Chart Question Answering | ChartQA | -- | 371 |
| Chart Question Answering | ChartQA (test) | -- | 190 |
| Chart-based Question Answering | ChartQA Pro | Accuracy 29.1 | 52 |
| Chart Understanding | CharXiv | Reasoning Score 28.3 | 44 |
| Chart-to-code Generation | ChartMimic Chart2Python (test) | Error Rate 52.7 | 38 |
| Chart Understanding | ChartBench | NQA 31.48 | 32 |
| Chart Question Answering | ChartQA augmented | Accuracy 90.96 | 26 |
| Chart-to-code Generation | Chart2NCode Chart2LaTeX (test) | Error Rate (ER) 17.1 | 19 |
| Chart-to-code Generation | Plot2Code Chart2Python (test) | Error Rate (ER) 70.5 | 19 |
| Chart-to-code Generation | Chart2NCode Chart2R (test) | Error Rate (ER) 39.3 | 18 |
Showing 10 of 21 rows
