Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs
About
Large language models (LLMs) are widely applied to data analytics over documents, yet direct reasoning over long, noisy documents remains brittle and error-prone. We therefore study document question answering (QA) that consolidates dispersed evidence into a structured output (e.g., a table, graph, or set of chunks) to support reliable, verifiable QA. We propose a two-pillar framework, LiteCoST, that achieves both high accuracy and low latency with small language models (SLMs).

Pillar 1: Chain-of-Structured-Thought (CoST). We introduce a CoST template, a schema-aware instruction that guides a strong LLM to produce both a step-wise CoST trace and the corresponding structured output. The process induces a minimal structure, normalizes entities and units, aligns records, serializes the output, and verifies and refines it, yielding auditable supervision.

Pillar 2: SLM fine-tuning. Compact models are trained on the LLM-generated CoST data in two stages: Supervised Fine-Tuning (SFT) for structural alignment, followed by Group Relative Policy Optimization (GRPO) with a triple reward covering answer quality, format quality, and process consistency.

By distilling this structure-first behavior into SLMs, LiteCoST achieves LLM-comparable quality on multi-domain long-document QA using 3B/7B SLMs, while delivering 2-4x lower latency than GPT-4o and DeepSeek-R1 (671B). The code is available at https://github.com/HKUSTDial/LiteCoST.
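The two pillars above can be sketched in a few lines: a schema-aware CoST instruction that enumerates the induce/normalize/align/serialize/verify steps, and a triple reward that GRPO would maximize. This is a minimal illustrative sketch; the function names, template wording, and reward weights are assumptions, not the repository's actual implementation.

```python
# Hypothetical sketch of the two pillars; all names, template text,
# and weights are illustrative assumptions, not LiteCoST's real code.

def cost_prompt(question: str, document: str) -> str:
    """Build a schema-aware CoST instruction for a strong teacher LLM."""
    steps = [
        "1. Induce a minimal structure (table, graph, or chunks) covering the evidence.",
        "2. Normalize entities and units.",
        "3. Align records drawn from different parts of the document.",
        "4. Serialize the structured output.",
        "5. Verify the structure against the document and refine it if needed.",
    ]
    return (
        f"Document:\n{document}\n\n"
        f"Question: {question}\n\n"
        "Produce a step-wise Chain-of-Structured-Thought trace:\n"
        + "\n".join(steps)
        + "\nThen answer using only the structured output."
    )

def triple_reward(answer_r: float, format_r: float, process_r: float,
                  weights=(0.5, 0.25, 0.25)) -> float:
    """Combine answer-, format-, and process-consistency rewards
    into the scalar signal GRPO optimizes (weights are assumed)."""
    wa, wf, wp = weights
    return wa * answer_r + wf * format_r + wp * process_r
```

In GRPO, such a scalar reward is computed per sampled completion and advantages are normalized within each sampled group, so only the relative ordering of completions matters.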
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Structured Information Extraction | Loong Finance (test) | Spotlight Locating (AS) | 83.97 | 10 |
| Structured output generation for long-document QA | Loong Finance | Spotlight Locating (AS) | 83.97 | 9 |
| Structured Data Extraction and Reasoning | Loong | Spotlight Locating Accuracy (AS) | 63.23 | 8 |
| Information Extraction | Loong Legal | Spotlight Locating Accuracy | 62.2 | 7 |
| Long-Document Question Answering | LongBench | NarQA Score | 30.4 | 6 |