Self-Consistency from Only Two Samples: CoT-PoT Ensembling for Efficient LLM Reasoning

About

Self-consistency (SC) is a popular technique for improving the reasoning accuracy of large language models by aggregating multiple sampled outputs, but the extensive sampling makes it computationally expensive. We introduce a hybrid ensembling approach that leverages the complementary strengths of two distinct modes of reasoning: Chain-of-Thought (CoT) and Program-of-Thought (PoT). We describe a general framework for combining these two forms of reasoning in self-consistency, as well as particular strategies for both full sampling and early stopping. We show that CoT-PoT ensembling not only improves overall accuracy but also reduces the number of samples required for SC by a factor of 9.3. In particular, the majority of tasks (78.6%) can be addressed with only two samples, which has not been possible with any prior SC method.

Raman Saparkhan, Majd Hawasly, Md Rizwan Parvez, Mohammad Raza • 2026
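
The abstract does not spell out the early-stopping rule, so the following is only a minimal sketch under one assumed reading: draw one CoT sample and one PoT sample, stop if the two answers agree, and otherwise keep alternating between the two modes up to a budget and take a majority vote. The function name `cot_pot_early_stop`, the sampler callables, and the `max_samples` budget are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Tuple

# A sampler draws one model response and returns its extracted final answer,
# or None if no answer could be extracted (e.g. the PoT program crashed).
# Both samplers are hypothetical stand-ins for real LLM calls.
Sampler = Callable[[], Optional[Hashable]]


def cot_pot_early_stop(
    sample_cot: Sampler,
    sample_pot: Sampler,
    max_samples: int = 20,  # assumed budget, not a value from the paper
) -> Tuple[Optional[Hashable], int]:
    """Return (answer, number_of_samples_used)."""
    if max_samples < 2:
        raise ValueError("max_samples must be at least 2")

    votes: Counter = Counter()

    # Step 1: one sample from each reasoning mode.
    cot_answer = sample_cot()
    pot_answer = sample_pot()
    used = 2
    for answer in (cot_answer, pot_answer):
        if answer is not None:
            votes[answer] += 1

    # Step 2 (assumed stopping rule): cross-mode agreement ends sampling.
    if cot_answer is not None and cot_answer == pot_answer:
        return cot_answer, used

    # Step 3 (assumed fallback): alternate modes until the budget is spent,
    # then take a plain majority vote over all extracted answers.
    while used < max_samples:
        sampler = sample_cot if used % 2 == 0 else sample_pot
        answer = sampler()
        used += 1
        if answer is not None:
            votes[answer] += 1

    if not votes:
        return None, used
    return votes.most_common(1)[0][0], used
```

With stub samplers that both return the same answer, the sketch stops after two samples. When the first pair disagrees, it spends the remaining budget before voting.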

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy (Acc): 96.1 | 337 |
| Mathematical Reasoning | TabMWP | Accuracy: 88.4 | 203 |
| Financial Reasoning | FinQA | Accuracy: 72.1 | 69 |
| Mathematical Reasoning | GSM8K | Accuracy: 96.2 | 7 |
| Mathematical Reasoning | MATH | Accuracy: 76.1 | 7 |
| Mathematical Reasoning | SVAMP | Accuracy: 95.6 | 7 |
| Mathematical Reasoning | FinQA | Accuracy: 72.2 | 7 |
| Table-based Mathematical Reasoning | TabMWP | Accuracy: 88.4 | 7 |
