Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs

About

A popular approach for improving the correctness of output from large language models (LLMs) is Self-Consistency - poll the LLM multiple times and output the most frequent solution. Existing Self-Consistency techniques always generate a constant number of samples per question, where a better approach will be to non-uniformly distribute the available budget based on the amount of agreement in the samples generated so far. In response, we introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question using a lightweight stopping criterion. Our experiments over 17 reasoning and code generation datasets and three LLMs demonstrate that Adaptive-Consistency reduces sample budget by up to 7.9 times with an average accuracy drop of less than 0.1%. Our code and data are available at https://www.sample-step-by-step.info

Pranjal Aggarwal, Aman Madaan, Yiming Yang, Mausam• 2023

Related benchmarks

Task	Dataset	Result
Mathematical Reasoning	GSM8K	Accuracy97.04	499
Mathematical Reasoning	MathQA	Accuracy84.1	354
Mathematical Reasoning	AIME 24	Accuracy86.6	318
Reasoning	GPQA Diamond	Accuracy45.69	185
Mathematical Reasoning	Omni-MATH	Accuracy43	123
Mathematical Reasoning	HMMT25	Accuracy48.8	119
Mathematical Reasoning	AMC 23	Pass@1 Accuracy97	109
Financial Reasoning	FinQA	Accuracy70.4	69
Mathematical Reasoning	AIME25	Accuracy76.7	41
Reasoning	AIME 25	Accuracy76.7	40

Showing 10 of 44 rows

Other info

Follow for update

@wizwand_team Discord