
Structured Reasoning for Large Language Models

About

Large language models (LLMs) achieve strong performance by generating long chains of thought, but longer traces often introduce redundant or ineffective reasoning steps. A typical behavior is unnecessary verification and revision even after the model has already reached the correct answer. This limitation stems from the unstructured nature of reasoning trajectories and the lack of targeted supervision for critical reasoning abilities. To address this, we propose Structured Reasoning (SCR), a framework that decouples reasoning trajectories into explicit, evaluable, and trainable components. We implement SCR primarily through a Generate-Verify-Revise paradigm: we construct structured training data and apply Dynamic Termination Supervision to guide the model in deciding when to terminate reasoning. To avoid interference between the learning signals for different reasoning abilities, we adopt a progressive two-stage reinforcement learning strategy: the first stage targets initial generation and self-verification, and the second focuses on revision. Extensive experiments on three backbone models show that SCR substantially improves reasoning efficiency and self-verification. Moreover, compared with existing reasoning paradigms, it reduces output token length by up to 50%.
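The Generate-Verify-Revise control flow with dynamic termination can be sketched as a simple loop. This is a minimal illustration only: the `generate`, `verify`, and `revise` functions below are hypothetical stand-ins (in SCR these are learned model behaviors, not hand-written stubs), and the termination rule shown (stop as soon as self-verification succeeds) is the idea supervised by Dynamic Termination Supervision.

```python
# Toy stand-ins for the three reasoning components. In SCR these are
# learned abilities of the LLM; here they are deliberately simple stubs
# so the control flow is runnable end to end.

def generate(problem):
    # Hypothetical initial-generation step: produce a (wrong) first draft.
    a, b = problem
    return a + b + 1

def verify(problem, answer):
    # Hypothetical self-verification step: accept or reject the draft.
    a, b = problem
    return answer == a + b

def revise(problem, answer):
    # Hypothetical revision step: adjust a rejected draft.
    return answer - 1

def structured_reasoning(problem, max_rounds=3):
    """Generate once, then alternate verify/revise.

    Dynamic termination: reasoning stops as soon as the answer is
    self-verified, avoiding redundant verification and revision.
    """
    answer = generate(problem)
    for _ in range(max_rounds):
        if verify(problem, answer):
            return answer  # terminate early: no further revision
        answer = revise(problem, answer)
    return answer  # revision budget exhausted

print(structured_reasoning((2, 3)))  # draft 6 fails verify, revised to 5
```

The point of the structure is that each component is separately evaluable, which is what enables the paper's two-stage RL schedule (generation plus verification first, revision second).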

Jinyi Han, Zixiang Di, Zishang Jiang, Ying Liao, Jiaqing Liang, Yongqi Wang, Yanghua Xiao • 2026

Related benchmarks

Task                               Dataset                    Metric          Result   Rank
Multi-task Language Understanding  MMLU-Pro                   Pass@1          56.3     64
Mathematical Reasoning             MATH500                    Pass@1          80       60
Aggregate Model Performance        Combined Benchmark Suite   Average Score   48.03    57
Expert-Level Question Answering    GPQA Diamond               Pass@1          39.39    39
Question Answering                 ARC                        Pass@1          86.43    30
Mathematical Reasoning             AIME 2025                  Avg@10          13.67    21
Mathematical Reasoning             AIME 2024                  Avg@10 Recall   13.67    21
Mathematical Reasoning             AMC                        Avg@10 Score    50.12    21
