
Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning

About

Language models such as GPT and Llama have shown remarkable ability on diverse natural language tasks, yet their performance on complex table tasks (e.g., NL-to-Code and data cleaning) remains suboptimal. Improving performance typically requires task-specific fine-tuning, which depends on expensive human labeling and is prone to overfitting. In this work, we propose Table-LLM-Specialist, a self-trained fine-tuning paradigm designed for table tasks. Our key insight is that many table tasks admit two dual formulations: a generative version and a classification version. Leveraging this duality, we introduce a Generator-Validator paradigm that iteratively generates and validates training data using language models, enabling effective fine-tuning without manually labeled data. Extensive evaluations on Llama, GPT-3.5, and GPT-4 show that Table-LLM-Specialist achieves (1) strong performance across diverse tasks compared to base models, for example, models fine-tuned on GPT-3.5 often surpass GPT-4 level quality; (2) lower deployment cost by enabling smaller models to reach high quality with reduced latency and cost; and (3) better generalization across multiple benchmarks, due to training on diverse, systematically generated data from real-world tables. Our code is available at https://github.com/microsoft/Table-Specialist. Models fine-tuned with Table-LLM-Specialist have been integrated into Microsoft Excel and are deployed in production for automated table data cleaning.

Junjie Xing, Yeye He, Mengyu Zhou, Haoyu Dong, Shi Han, Dongmei Zhang, Surajit Chaudhuri • 2024
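
The Generator-Validator idea in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the error-detection task, the names `toy_generator`, `toy_validator`, `generate_validated_data`, and `self_train` are all hypothetical, and simple rule-based functions stand in for the language models. The generator answers the generative form of the task ("which value in this column is an error?"), the validator answers the classification form ("is this value an error?"), and only examples on which the two agree are kept as training data.

```python
import random
from typing import Callable, List, Tuple

Column = List[str]
Example = Tuple[Column, str]  # (column values, value flagged as an error)


def toy_generator(column: Column, rng: random.Random) -> str:
    """Generative form: name the value that looks like an error.

    Hypothetical stand-in for a language model: it usually picks a
    non-numeric cell, but sometimes guesses at random (a noisy label).
    """
    suspects = [v for v in column if not v.isdigit()]
    if suspects and rng.random() < 0.8:
        return suspects[0]
    return rng.choice(column)


def toy_validator(column: Column, candidate: str) -> bool:
    """Classification form: is `candidate` an error in this column?

    Hypothetical stand-in for a language model acting as a classifier.
    """
    return not candidate.isdigit()


def generate_validated_data(
    columns: List[Column],
    generator: Callable[[Column, random.Random], str],
    validator: Callable[[Column, str], bool],
    rng: random.Random,
) -> List[Example]:
    """Keep only generated examples that the validator accepts."""
    kept: List[Example] = []
    for column in columns:
        candidate = generator(column, rng)
        if validator(column, candidate):
            kept.append((column, candidate))
    return kept


def self_train(columns: List[Column], rounds: int, seed: int = 0) -> List[List[Example]]:
    """Run several generate-then-validate rounds and collect the data.

    Assumption: a real system would fine-tune the models on each round's
    data before the next round; that step is only marked by a comment here.
    """
    rng = random.Random(seed)
    per_round: List[List[Example]] = []
    for _ in range(rounds):
        data = generate_validated_data(columns, toy_generator, toy_validator, rng)
        per_round.append(data)
        # Fine-tuning on `data` would happen here (omitted in this sketch).
    return per_round
```

In this sketch, a column with no errors never contributes a training example, because any value the generator proposes for it is rejected by the validator. That filtering is what replaces manual labeling in the toy setting.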

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-SQL | Spider | Exec Acc (All) | 70.4 | 91 |
| Text-to-SQL | Bird | Total Execution Accuracy | 55.6 | 64 |
| Text-to-SQL | Bird | Accuracy | 47.5 | 27 |
| NL-to-SQL | WikiSQL | Execution Accuracy | 87.4 | 18 |
| NL-to-SQL | WikiTQ | Execution Accuracy | 59.7 | 12 |
| NL-to-SQL | Text2Analysis | Execution Accuracy | 57.2 | 12 |
| Data-transformation (Pandas) | TDE | Execution Accuracy | 45.6 | 6 |
| Data-transformation (R) | TDE | Execution Accuracy | 31.8 | 6 |
| Data-transformation (SQL) | TDE | Execution Accuracy | 20.2 | 6 |
| Data-transformation (SQL) | Transform-Text | Execution Accuracy | 22.7 | 6 |
Showing 10 of 31 rows
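
Most results listed here are execution accuracies. As a rough illustration of what that metric measures for SQL tasks, the sketch below runs each predicted query and its reference query against the same database and counts the pairs whose results match. It is an assumption-laden example, not the scoring script of any benchmark listed here: `run_query`, `execution_accuracy`, `build_demo_db`, and the `sales` table are hypothetical, rows are compared without regard to order, and a predicted query that fails to execute counts as wrong.

```python
import sqlite3
from typing import List, Optional, Tuple


def run_query(conn: sqlite3.Connection, sql: str) -> Optional[list]:
    """Execute `sql` and return its rows in a canonical order, or None on error."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def execution_accuracy(conn: sqlite3.Connection, pairs: List[Tuple[str, str]]) -> float:
    """Percentage of (predicted, reference) pairs whose query results match."""
    correct = 0
    for predicted_sql, reference_sql in pairs:
        predicted_rows = run_query(conn, predicted_sql)
        reference_rows = run_query(conn, reference_sql)
        # A query that fails to execute never counts as correct.
        if predicted_rows is not None and predicted_rows == reference_rows:
            correct += 1
    return 100.0 * correct / len(pairs)


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory table used only for this illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10), ("south", 20), ("east", 30)],
    )
    return conn


if __name__ == "__main__":
    demo_pairs = [
        # Different SQL text, same result: counted as correct.
        ("SELECT region FROM sales WHERE amount > 15",
         "SELECT region FROM sales WHERE amount >= 20"),
        # Executes, but returns a different result: counted as wrong.
        ("SELECT COUNT(*) FROM sales", "SELECT SUM(amount) FROM sales"),
        # Syntax error in the prediction: counted as wrong.
        ("SELEC region FROM sales", "SELECT region FROM sales"),
        # Different formulation of the maximum: counted as correct.
        ("SELECT MAX(amount) FROM sales",
         "SELECT amount FROM sales ORDER BY amount DESC LIMIT 1"),
    ]
    print(execution_accuracy(build_demo_db(), demo_pairs))
```

Scoring by executed result rather than by query text means that two differently written queries can both be counted as correct, which is why the metric suits generated code.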
