
C3oT: Generating Shorter Chain-of-Thought without Compromising Effectiveness

About

Generating Chain-of-Thought (CoT) before deriving the answer can effectively improve the reasoning capabilities of large language models (LLMs) and significantly improve the accuracy of the generated answer. However, in most cases, the length of the generated CoT is much longer than the desired final answer, which results in additional decoding costs. Furthermore, existing research has discovered that shortening the reasoning steps in CoT, even while preserving the key information, diminishes LLMs' abilities. These phenomena make it difficult to use LLMs and CoT in many real-world applications that only require the final answer and are sensitive to latency, such as search and recommendation. To reduce the costs of model decoding and shorten the length of the generated CoT, this paper presents Conditioned Compressed Chain-of-Thought (C3oT), a CoT compression framework that involves a compressor to compress an original longer CoT into a shorter CoT while maintaining key information and interpretability, a conditioned training method to train LLMs with both longer CoT and shorter CoT simultaneously to learn the corresponding relationships between them, and a conditioned inference method to gain the reasoning ability learned from longer CoT by generating shorter CoT. We conduct experiments over four datasets from arithmetic and commonsense scenarios, showing that the proposed method is capable of compressing the length of generated CoT by up to more than 50% without compromising its effectiveness.
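The conditioned training and inference scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the condition markers `[LONG]`/`[SHORT]` and the prompt template are assumptions made here for clarity; the abstract does not specify the exact tokens or format.

```python
# Hypothetical condition markers; the paper's actual tokens may differ.
LONG_COND = "[LONG]"    # condition for the original, longer CoT
SHORT_COND = "[SHORT]"  # condition for the compressed, shorter CoT

def build_training_examples(question, long_cot, short_cot, answer):
    """Pair each question with BOTH CoT variants, each behind its own
    condition marker, so the model is trained on the two forms
    simultaneously and can learn the correspondence between them."""
    return [
        f"{LONG_COND} Q: {question}\nReasoning: {long_cot}\nAnswer: {answer}",
        f"{SHORT_COND} Q: {question}\nReasoning: {short_cot}\nAnswer: {answer}",
    ]

def build_inference_prompt(question):
    """At inference time, condition only on the short marker so the model
    generates the compressed CoT while retaining the reasoning ability
    learned from the longer one."""
    return f"{SHORT_COND} Q: {question}\nReasoning:"

# Example: a long CoT and its compressed counterpart (here, a toy
# compression; the paper uses a dedicated compressor model).
examples = build_training_examples(
    question="What is 12 * 7?",
    long_cot="12 * 7 = 12 * (5 + 2) = 60 + 24 = 84.",
    short_cot="12 * 7 = 84.",
    answer="84",
)
prompt = build_inference_prompt("What is 12 * 7?")
```

The key design point is that both conditioned variants of the same question appear in the training set, so the shorter CoT is learned in relation to the longer one rather than in isolation.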

Yu Kang, Xianghui Sun, Liangyu Chen, Wei Zou • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Mathematical Reasoning | GSM8K | Accuracy: 93.5 | 351 |
| Multi-discipline Multimodal Understanding | MMMU | Accuracy: 59.3 | 266 |
| Visual Mathematical Reasoning | MathVision | Accuracy: 24.1 | 63 |
| Mathematical Reasoning | MathVision | Accuracy: 28.8 | 38 |
| Medical Question Answering | Medical Benchmarks (MedQA, MedMCQA, BULLET) (test) | MedQA Accuracy: 0.5533 | 18 |
| Mathematical Reasoning | MATH | Accuracy: 92 | 18 |
| Mathematical Reasoning | AIME 24 | Accuracy: 51.11 | 18 |
| Mathematical Reasoning | Math Benchmarks Aggregate | Accuracy (Avg): 80.78 | 18 |
| Mathematical Reasoning | AMC23 | Accuracy: 87.5 | 18 |
| Physical Reasoning | PhyX | Accuracy: 43.8 | 16 |

Showing 10 of 13 rows.
