TableGPT-R1: Advancing Tabular Reasoning Through Reinforcement Learning

About

Tabular data serves as the backbone of modern data analysis and scientific research. While Large Language Models (LLMs) fine-tuned via Supervised Fine-Tuning (SFT) have significantly improved natural language interaction with such structured data, they often fall short in handling the complex, multi-step reasoning and robust code execution required for real-world table tasks. Reinforcement Learning (RL) offers a promising avenue to enhance these capabilities, yet its application in the tabular domain faces three critical hurdles: the scarcity of high-quality agentic trajectories with closed-loop code execution and environment feedback on diverse table structures, the extreme heterogeneity of feedback signals ranging from rigid SQL execution to open-ended data interpretation, and the risk of catastrophic forgetting of general knowledge during vertical specialization. To overcome these challenges and unlock advanced reasoning on complex tables, we introduce TableGPT-R1, a specialized tabular model built on a systematic RL framework. Our approach integrates a comprehensive data engineering pipeline that synthesizes difficulty-stratified agentic trajectories for both supervised alignment and RL rollouts, a task-adaptive reward system that combines rule-based verification with a criteria-injected reward model and incorporates process-level step reward shaping with behavioral regularization, and a multi-stage training framework that progressively stabilizes reasoning before specializing in table-specific tasks. Extensive evaluations demonstrate that TableGPT-R1 achieves state-of-the-art performance on authoritative benchmarks, significantly outperforming baseline models while retaining robust general capabilities. Our model is available at https://huggingface.co/tablegpt/TableGPT-R1.

Saisai Yang, Qingyi Huang, Jing Yuan, Liangyu Zha, Kai Tang, Yuhang Yang, Ning Wang, Yucheng Wei, Liyao Li, Wentao Ye, Hao Chen, Tao Zhang, Junlin Zhou, Haobo Wang, Gang Chen, Junbo Zhao • 2025

Related benchmarks

Task                              Dataset            Result                          Rank
Text-to-SQL                       Spider             --                              57
Chart Generation                  RealHitBench       ECR 55.84                       49
Fact Checking                     RealHitBench       Exact Match 63.85               49
Structure Comprehending           RealHitBench       Exact Match (EM) 64.12          49
Data Analysis                     RealHitBench       GPT Score 66.53                 49
Text-to-SQL                       Bird               Total Execution Accuracy 63.17  22
Numerical Reasoning               RealHitBench       Exact Match (EM) 49.03          21
Agent-based Data Analysis         InfiAgent-DABench  Accuracy 80.54                  13
Data Processing                   TableBench         Rge 48.35                       13
Table Chain of Thought Reasoning  TableBench         Rge 48.28                       13
Showing 10 of 14 rows
