TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

About

We introduce TableLLM, a robust large language model (LLM) with 8 billion parameters, purpose-built for tabular data manipulation tasks in real-world office scenarios, whether the tables are embedded in documents or spreadsheets. We propose a distant supervision method for training that comprises two strategies: a reasoning process extension strategy, which helps LLMs learn reasoning patterns more effectively, and a cross-way validation strategy, which ensures the quality of the automatically generated data. To evaluate TableLLM, we have crafted benchmarks covering both document and spreadsheet formats and constructed a well-organized evaluation pipeline capable of handling both scenarios. Thorough evaluations underscore the advantages of TableLLM over various existing general-purpose and tabular data-focused LLMs. We have publicly released the model checkpoint, source code, benchmarks, and a web application for user interaction. Our code and data are publicly available at https://github.com/TableLLM/TableLLM.

Xiaokang Zhang, Sijia Luo, Bohan Zhang, Zeyao Ma, Jing Zhang, Yang Li, Guanlin Li, Zijun Yao, Kangli Xu, Jinchang Zhou, Daniel Zhang-Li, Jifan Yu, Shu Zhao, Juanzi Li, Jie Tang • 2024

Related benchmarks

Task                           Dataset                    Metric            Result  Rank
Instruction Following          IFEval                     IFEval Accuracy   25.71   625
Table Question Answering       HiTab                      Accuracy          45.4    121
Table Question Answering       WikiTQ                     Accuracy          53.59   118
Table Fact Verification        TabFact                    Accuracy          0.6981  104
Table Question Answering       WTQ                        Accuracy          31.74   101
Table Question Answering       TabMWP                     Accuracy          42.24   79
Table Question Answering       AIT-QA                     Accuracy          79.08   58
Table-based Fact Verification  TabFact                    Accuracy          29.24   49
Tabular Question Answering     ReasonTabQA 1.0 (Overall)  Overall Accuracy  13.5    33
Language Understanding         MMLU                       Humanities Avg    12.55   33
Showing 10 of 20 rows
