
S3-CoT: Self-Sampled Succinct Reasoning Enables Efficient Chain-of-Thought LLMs

About

Large language models (LLMs) equipped with chain-of-thought (CoT) achieve strong performance and offer a window into LLM behavior. However, recent evidence suggests that improvements in CoT capabilities often come with redundant reasoning processes, motivating a key question: can LLMs acquire a fast-thinking mode analogous to human System 1 reasoning? To explore this, our study presents a self-sampling framework based on activation steering for efficient CoT learning. Our method induces style-aligned, variable-length reasoning traces from the target LLMs themselves without any teacher guidance, thereby alleviating a central bottleneck of SFT-based methods: the scarcity of high-quality supervision data. Using data filtered against gold answers, we perform SFT for efficient CoT learning with (i) a human-like dual-cognitive system and (ii) a progressive compression curriculum. Furthermore, we explore a self-evolution regime in which SFT is driven solely by prediction-consistent data among variable-length variants, eliminating the need for gold answers. Extensive experiments on math benchmarks, together with cross-domain generalization tests in medicine, show that our method yields stable improvements for both general and R1-style LLMs. Our data and model checkpoints can be found at https://github.com/DYR1/S3-CoT.
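To give a sense of the activation-steering idea underlying the self-sampling framework, here is a minimal numpy sketch. It uses the common difference-of-means construction: a steering direction is estimated from activations of concise versus verbose reasoning traces, then added (scaled) to a hidden state to bias generation toward the concise style. The function names (`steering_direction`, `apply_steering`) and the synthetic activations are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def steering_direction(concise_acts: np.ndarray, verbose_acts: np.ndarray) -> np.ndarray:
    """Difference-of-means direction pointing from the verbose activation
    cluster toward the concise one (a common steering-vector construction;
    hypothetical here, not necessarily the paper's exact recipe)."""
    return concise_acts.mean(axis=0) - verbose_acts.mean(axis=0)

def apply_steering(hidden: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Shift a hidden state along the (unit-normalized) steering direction;
    varying alpha yields variable-strength, and hence variable-length, styles."""
    unit = direction / np.linalg.norm(direction)
    return hidden + alpha * unit

# Synthetic stand-ins for layer activations collected from the model itself.
rng = np.random.default_rng(0)
concise = rng.normal(loc=1.0, size=(32, 8))   # activations on concise traces
verbose = rng.normal(loc=-1.0, size=(32, 8))  # activations on verbose traces

d = steering_direction(concise, verbose)
h = rng.normal(size=8)                         # one hidden state to steer
h_steered = apply_steering(h, d, alpha=2.0)

# After steering, the state should sit closer to the concise cluster mean.
print("distance before:", float(np.linalg.norm(h - concise.mean(axis=0))))
print("distance after: ", float(np.linalg.norm(h_steered - concise.mean(axis=0))))
```

In practice such a shift would be applied inside the forward pass of a chosen transformer layer (e.g. via a hook) rather than to a standalone vector; sweeping `alpha` is one plausible way to self-sample reasoning traces of different lengths from the same model.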

Yanrui Du, Sendong Zhao, Yibo Gao, Danyang Zhao, Qika Lin, Ming Ma, Jiayun Li, Yi Jiang, Kai He, Qianyi Xu, Bing Qin, Mengling Feng • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Mathematical Reasoning | GSM8K | Accuracy: 93.17 | 351 |
| Mathematical Reasoning | AMC23 | Accuracy: 90.83 | 18 |
| Mathematical Reasoning | Math Benchmarks Aggregate | Accuracy (Avg): 81.28 | 18 |
| Medical Question Answering | Medical Benchmarks (MedQA, MedMCQA, BULLET) (test) | MedQA Accuracy: 0.545 | 18 |
| Mathematical Reasoning | MATH | Accuracy: 92 | 18 |
| Mathematical Reasoning | AIME 24 | Accuracy: 51.11 | 18 |
| Mathematical Reasoning | Math Benchmarks (GSM8K, MATH, AMC23, AIME24) (test) | Accuracy (GSM8K): 95 | 8 |
