
The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models

About

In this paper, we introduce Concise Chain-of-Thought (CCoT) prompting. We compared standard CoT and CCoT prompts to see how conciseness impacts response length and correct-answer accuracy. We evaluated this using GPT-3.5 and GPT-4 with a multiple-choice question-answering (MCQA) benchmark. CCoT reduced average response length by 48.70% for both GPT-3.5 and GPT-4 while having a negligible impact on problem-solving performance. However, on math problems, GPT-3.5 with CCoT incurred a performance penalty of 27.69%. Overall, CCoT leads to an average per-token cost reduction of 22.67%. All code, data, and supplemental materials are available on GitHub at https://github.com/matthewrenze/jhu-concise-cot
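The core idea can be sketched in a few lines: the same MCQA question is posed with either a standard step-by-step instruction or a concise variant, and the resulting response lengths are compared. The prompt wording and helper names below are hypothetical illustrations, not the paper's exact prompts (those are in the linked repository).

```python
# Sketch of standard CoT vs. Concise CoT (CCoT) prompting for an MCQA task.
# The prefix wording here is an assumed approximation of the paper's prompts.

COT_PREFIX = "Think step by step to answer the following question."
CCOT_PREFIX = ("Think step by step, but be concise in your reasoning, "
               "to answer the following question.")


def build_prompt(question: str, choices: list[str], concise: bool = False) -> str:
    """Assemble an MCQA prompt with a standard or concise CoT instruction."""
    prefix = CCOT_PREFIX if concise else COT_PREFIX
    # Label answer options A, B, C, ...
    options = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(choices))
    return f"{prefix}\n\nQuestion: {question}\n{options}\nAnswer:"


def length_reduction(cot_response: str, ccot_response: str) -> float:
    """Percent reduction in response length, a rough proxy for the
    response-length metric reported in the paper."""
    return 100.0 * (1.0 - len(ccot_response) / len(cot_response))


prompt = build_prompt("What is 2 + 2?", ["3", "4", "5"], concise=True)
```

In practice both prompt variants would be sent to the model under test (e.g., GPT-3.5 or GPT-4) and the responses scored for answer accuracy alongside length.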

Matthew Renze, Erhan Guven • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Mathematical Reasoning | GSM8K | Accuracy | 93.33 | 499 |
| Mathematical Reasoning | Math Benchmarks Aggregate | Accuracy (Avg) | 81.9 | 40 |
| Mathematical Reasoning | AMC23 | Accuracy | 90.83 | 18 |
| Mathematical Reasoning | MATH | Accuracy | 92.33 | 18 |
| Mathematical Reasoning | AIME 24 | Accuracy | 51.11 | 18 |
| Medical Question Answering | Medical Benchmarks (MedQA, MedMCQA, BULLET) (test) | MedQA Accuracy | 0.4917 | 18 |
| Mathematical Reasoning | AIME 2025 | Accuracy | 35.33 | 12 |
| Mathematical Reasoning | MATH 500 | Accuracy (%) | 92.26 | 12 |
| Mathematical Reasoning | AIME 2024 | Accuracy | 52.33 | 12 |
| Mathematical Reasoning | Math Benchmarks (GSM8K, MATH, AMC23, AIME24) (test) | Accuracy (GSM8K) | 96 | 8 |
