
Revisiting LLM Reasoning via Information Bottleneck

About

Large language models (LLMs) have recently demonstrated remarkable progress in reasoning capabilities through reinforcement learning with verifiable rewards (RLVR). By leveraging simple rule-based rewards, RL effectively incentivizes LLMs to produce extended chain-of-thought (CoT) reasoning trajectories, progressively guiding them toward correct answers. However, existing approaches remain largely heuristic and intuition-driven, limiting the development of principled methodologies. In this paper, we present a theoretical characterization of LLM reasoning grounded in the information bottleneck (IB) principle and introduce IB-aware reasoning optimization (IBRO), a framework that encourages reasoning trajectories to be both informative about the final correct answer and generalizable across diverse prompts. We derive a practical token-level surrogate objective and propose an efficient approximation, resulting in a lightweight IB regularization method. This technique integrates seamlessly into existing RL-based post-training frameworks without additional computational overhead, requiring only a one-line code modification. Empirically, we validate IB regularization across multiple mathematical reasoning benchmarks and RL algorithms, demonstrating consistent improvements in LLM reasoning performance.
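For orientation, the classical information bottleneck objective that the abstract refers to can be written as below. This is the standard textbook form, not the paper's exact IBRO objective: mapping the input $X$ to the prompt, the representation $Z$ to the reasoning trajectory, and the target $Y$ to the correct answer is an assumption made here for illustration, and $\beta$ is the usual trade-off coefficient.

```latex
% Classical IB Lagrangian: compress the input X into Z while keeping Z predictive of Y.
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y), \qquad \beta > 0
% Assumed reading for LLM reasoning (illustrative only):
%   X = prompt, Z = reasoning trajectory, Y = final correct answer.
% A small I(X; Z) favours trajectories that do not overfit to a particular prompt;
% a large I(Z; Y) favours trajectories that are informative about the answer.
```

Under this reading, the two terms correspond to the two properties named in the abstract: generalizable across prompts and informative about the final answer.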

Shiye Lei, Zhihao Cheng, Kai Jia, Dacheng Tao • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | AIME 25 | Accuracy | 15.7 | 112 |
| Instruction Following | IFEval | Accuracy (IFEval) | 54.3 | 89 |
| Science Reasoning | GPQA | Accuracy (GPQA) | 44.7 | 72 |
| Mathematics | AIME 25 | Avg@32 | 14.5 | 20 |
| Mathematics | AIME 24 | Avg@32 | 0.169 | 20 |
| Comprehensive Evaluation | Overall Across Benchmarks | Avg@32 Accuracy | 41.6 | 16 |
| Instruction | IFEval | Avg@32 Accuracy | 44.7 | 16 |
| Mathematics | MATH 500 | Accuracy (avg@32) | 82 | 16 |
| Mathematics | AMC 23 | Avg@32 Accuracy | 55.3 | 16 |
| Mathematics | AMC 24 | Accuracy (avg@32) | 39.5 | 16 |
Showing 10 of 12 rows
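
Several rows report Avg@32. The listing does not define the metric; the sketch below assumes the common convention that it is the accuracy averaged over 32 sampled responses per problem and then over problems. The function name `avg_at_k` and the toy data are hypothetical and use k = 4 for brevity.

```python
from statistics import mean


def avg_at_k(correct_per_problem):
    """Average per-problem accuracy over k sampled responses, then over problems.

    correct_per_problem: list of lists of booleans, one inner list per problem,
    each holding the correctness of the k sampled responses (assumed convention).
    """
    per_problem = [
        mean([1.0 if c else 0.0 for c in samples])
        for samples in correct_per_problem
    ]
    return mean(per_problem)


# Hypothetical toy data: 2 problems, k = 4 samples each.
toy = [
    [True, False, True, True],    # 3 of 4 correct
    [False, False, True, False],  # 1 of 4 correct
]
score = avg_at_k(toy)
print(f"Avg@4 = {100 * score:.1f}")
```

Leaderboards differ in whether they report this value as a fraction or as a percentage, which may explain why some results in a table appear on a 0 to 1 scale and others on a 0 to 100 scale.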
