
The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models

About

In this paper, we introduce Concise Chain-of-Thought (CCoT) prompting. We compared standard CoT and CCoT prompts to see how conciseness impacts response length and correct-answer accuracy. We evaluated this using GPT-3.5 and GPT-4 with a multiple-choice question-and-answer (MCQA) benchmark. CCoT reduced average response length by 48.70% for both GPT-3.5 and GPT-4 while having a negligible impact on problem-solving performance. However, on math problems, GPT-3.5 with CCoT incurs a performance penalty of 27.69%. Overall, CCoT leads to an average per-token cost reduction of 22.67%. All code, data, and supplemental materials are available on GitHub at https://github.com/matthewrenze/jhu-concise-cot
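The core difference between standard CoT and CCoT prompting is a single instruction asking the model to keep its reasoning brief. The exact prompt wording below is an assumption for illustration (the paper's actual prompts are in the linked GitHub repository); this minimal sketch shows how the two prompt variants might be assembled for an MCQA question.

```python
# Hedged sketch of standard CoT vs. Concise CoT (CCoT) prompting.
# The instruction strings are hypothetical, not the paper's verbatim prompts.
STANDARD_COT = (
    "Answer the following multiple-choice question. "
    "Think step by step before giving your final answer."
)
CONCISE_COT = (
    "Answer the following multiple-choice question. "
    "Think step by step, but be concise: keep each reasoning step "
    "as short as possible before giving your final answer."
)

def build_prompt(question: str, concise: bool = False) -> str:
    """Prepend the chosen chain-of-thought instruction to an MCQA question."""
    instruction = CONCISE_COT if concise else STANDARD_COT
    return f"{instruction}\n\nQuestion: {question}"

# Example usage:
print(build_prompt("What is 7 * 8? (A) 54 (B) 56 (C) 58 (D) 64", concise=True))
```

Because both variants differ only in the instruction prefix, response-length and accuracy differences between them can be attributed to the conciseness request itself.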

Matthew Renze, Erhan Guven • 2024

Related benchmarks

Task                       | Dataset                                             | Result                 | Rank
---------------------------|-----------------------------------------------------|------------------------|-----
Mathematical Reasoning     | GSM8K                                               | Accuracy: 93.33        | 351
Mathematical Reasoning     | AMC23                                               | Accuracy: 90.83        | 18
Mathematical Reasoning     | Math Benchmarks Aggregate                           | Accuracy (Avg): 81.9   | 18
Mathematical Reasoning     | MATH                                                | Accuracy: 92.33        | 18
Mathematical Reasoning     | AIME 24                                             | Accuracy: 51.11        | 18
Medical Question Answering | Medical Benchmarks (MedQA, MedMCQA, BULLET) (test)  | MedQA Accuracy: 0.4917 | 18
Mathematical Reasoning     | Math Benchmarks (GSM8K, MATH, AMC23, AIME24) (test) | Accuracy (GSM8K): 96   | 8
