Probing How Scalable Table Data Enhances General Long-Context Reasoning

About

As real-world tasks grow increasingly complex, long-context reasoning has become a core capability for Large Language Models (LLMs). However, few studies explore which data types are effective for long-context reasoning and why. We find that structured table data with periodic structures shows strong potential for long-context reasoning. Motivated by this observation, we mathematically analyze tabular dependency structures using mutual information, revealing periodic non-vanishing dependencies in table data. Furthermore, we systematically analyze the capabilities of structured table data, conduct relevant scaling experiments, and validate its underlying mechanisms for enhancing long-context reasoning, yielding several meaningful insights. Leveraging these insights, we propose a simple yet scalable pipeline (TableLong) for synthesizing high-quality, diverse, and verifiable structured table data to boost long-context reasoning via RL. Extensive experimental results demonstrate that table data significantly enhances the long-context reasoning capability of LLMs across multiple long-context benchmarks (+8.24% on average), and even improves performance on out-of-domain benchmarks (+8.06% on average). We hope that our insights provide practical guidance for choosing effective post-training data to enhance long-context reasoning in LLMs.
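
The "periodic non-vanishing dependencies" mentioned above can be illustrated with a toy experiment. The sketch below is not the paper's analysis or the TableLong pipeline: the synthetic generator, the function names (`make_table`, `lagged_mutual_information`), and all parameter values are assumptions made purely for illustration. It flattens small random tables row by row and estimates the mutual information between tokens separated by a given lag. Because cells in the same column share a hidden "typical" value, the estimate should stay clearly positive at lags that are multiples of the column count and fall close to zero elsewhere.

```python
import math
import random
from collections import Counter


def make_table(rng, n_rows, n_cols, vocab_size, p_keep):
    """Return one synthetic table flattened row by row into a token list.

    Hypothetical generator: each column gets a hidden "typical" value; every
    cell in that column repeats it with probability p_keep and is uniform
    noise otherwise.
    """
    typical = [rng.randrange(vocab_size) for _ in range(n_cols)]
    tokens = []
    for _ in range(n_rows):
        for col in range(n_cols):
            if rng.random() < p_keep:
                tokens.append(typical[col])
            else:
                tokens.append(rng.randrange(vocab_size))
    return tokens


def lagged_mutual_information(sequences, lag):
    """Plug-in estimate (in nats) of I(x_i; x_{i+lag}), pooled over positions."""
    joint = Counter()
    for seq in sequences:
        for i in range(len(seq) - lag):
            joint[(seq[i], seq[i + lag])] += 1
    total = sum(joint.values())
    left, right = Counter(), Counter()
    for (a, b), n in joint.items():
        left[a] += n
        right[b] += n
    # sum over observed pairs of p(a,b) * log(p(a,b) / (p(a) * p(b)))
    return sum(
        (n / total) * math.log(n * total / (left[a] * right[b]))
        for (a, b), n in joint.items()
    )


rng = random.Random(0)  # fixed seed so the run is repeatable
N_COLS = 5              # assumed column count for the toy tables
tables = [
    make_table(rng, n_rows=8, n_cols=N_COLS, vocab_size=4, p_keep=0.8)
    for _ in range(1000)
]
mi_by_lag = {lag: lagged_mutual_information(tables, lag) for lag in range(1, 13)}

for lag, mi in mi_by_lag.items():
    marker = "  <- multiple of column count" if lag % N_COLS == 0 else ""
    print(f"lag {lag:2d}: MI = {mi:.3f} nats{marker}")
```

In this toy setting the dependency at lag 10 is as strong as at lag 5, which is the sense in which it does not vanish with distance. Real tables have richer structure, so the sketch only conveys the intuition.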

Huaibing Xie, Guoliang Zhao, Yang Liu, Shihan Dou, Siming Huang, Yanling Xiao, Shaolei Wang, Yiting Liu, Cheng Zhang, Shaofan Liu, Pluto Zhou • 2026

Related benchmarks

Task | Dataset | Result | Rank
Code Reasoning | LiveCodeBench | Accuracy 58.68 | 62
Long-context Reasoning | LongBench v2 | -- | 48
Long-context retrieval and synthetic reasoning | RULER | Accuracy 80.72 | 47
Science Reasoning | GPQA Diamond | Accuracy 63.64 | 34
Long-context Understanding | MRCR | Accuracy 42.66 | 15
Long-context Reasoning | BrowsCompLong | Accuracy 74.31 | 11
Long-Context Mathematical Reasoning | GSM-Infinite | Accuracy 23.4 | 11
Long-context Reasoning | Loong | Accuracy 45.3 | 11
Long-context Reasoning | Oolong-Synth | Accuracy 51.41 | 11
Multi-turn Dialogue Reasoning | MultiChallenge | Accuracy 32.97 | 4
