
Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs

About

A popular approach for improving the correctness of output from large language models (LLMs) is Self-Consistency: poll the LLM multiple times and output the most frequent solution. Existing Self-Consistency techniques always generate a constant number of samples per question, whereas a better approach would be to non-uniformly distribute the available budget based on the amount of agreement among the samples generated so far. In response, we introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question using a lightweight stopping criterion. Our experiments over 17 reasoning and code generation datasets and three LLMs demonstrate that Adaptive-Consistency reduces the sample budget by up to 7.9 times with an average accuracy drop of less than 0.1%. Our code and data are available at https://www.sample-step-by-step.info
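The core loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a Beta posterior over the top answer's vote share as the lightweight stopping criterion (the paper also considers a Dirichlet-based variant), and the function names, threshold, and budget are illustrative choices.

```python
import random
from collections import Counter


def majority_stable_prob(counts, trials=2000, seed=0):
    """Estimate P(top answer is the true majority) under a Beta posterior.

    counts: Counter mapping answer -> votes observed so far.
    Uses Monte Carlo over p ~ Beta(top+1, rest+1) to approximate P(p > 0.5).
    """
    if not counts:
        return 0.0
    top = counts.most_common(1)[0][1]
    rest = sum(counts.values()) - top
    rng = random.Random(seed)
    hits = sum(rng.betavariate(top + 1, rest + 1) > 0.5 for _ in range(trials))
    return hits / trials


def adaptive_consistency(sample_fn, max_samples=40, threshold=0.95):
    """Draw LLM samples one at a time; stop early once the majority looks stable.

    sample_fn: zero-argument callable that returns one sampled answer.
    Returns (majority answer, number of samples actually drawn).
    """
    counts = Counter()
    for _ in range(max_samples):
        counts[sample_fn()] += 1
        if majority_stable_prob(counts) >= threshold:
            break  # agreement is high enough; skip the remaining budget
    return counts.most_common(1)[0][0], sum(counts.values())
```

On an easy question where every sample agrees, the loop stops after a handful of samples instead of exhausting the full budget, which is the source of the cost savings reported above.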

Pranjal Aggarwal, Aman Madaan, Yiming Yang, Mausam • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy | 97.04 | 351 |
| Reasoning | GPQA Diamond | Accuracy | 45.69 | 88 |
| Mathematical Reasoning | HMMT25 | Accuracy | 48.8 | 78 |
| Mathematical Reasoning | Omni-MATH | Accuracy | 43 | 68 |
| Reasoning | AIME 25 | Accuracy | 76.7 | 40 |
| General Knowledge Reasoning | MMLU-Pro | Accuracy | 75.72 | 31 |
| Mathematical Reasoning | MATH500 | Accuracy | 83.8 | 30 |
| Science Question Answering | GPQA | Memory Ratio | 0.21 | 24 |
| Mathematical Reasoning | AMC | C_mem (Ratio) | 0.14 | 24 |
| Mathematical Reasoning | MATH500 | Memory Usage Ratio | 18 | 24 |

Showing 10 of 23 rows.
