More Agents Is All You Need

About

We find that, simply via a sampling-and-voting method, the performance of large language models (LLMs) scales with the number of agents instantiated. This method, termed Agent Forest, is orthogonal to existing, more complicated methods for further enhancing LLMs, and the degree of enhancement is correlated with task difficulty. We conduct comprehensive experiments on a wide range of LLM benchmarks to verify this finding and to study the properties that facilitate it. Our code is publicly available at: https://github.com/MoreAgentsIsAllYouNeed/AgentForest

Junyou Li, Qin Zhang, Yangbin Yu, Qiang Fu, Deheng Ye • 2024
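
The sampling-and-voting idea described in the abstract can be illustrated with a minimal sketch: query the same model several times and keep the most frequent answer. This is not taken from the authors' repository. The function name `sample_and_vote` and the `sample_answer` callable are hypothetical, and plain majority voting over exact-match answer strings is an assumption that only suits tasks with short, closed-form answers.

```python
from collections import Counter
from typing import Callable, List


def sample_and_vote(
    query: str,
    sample_answer: Callable[[str], str],  # hypothetical: one stochastic LLM call
    num_agents: int,
) -> str:
    """Ask `num_agents` independent samples for an answer and return the majority."""
    if num_agents < 1:
        raise ValueError("num_agents must be at least 1")

    # Sampling phase: each "agent" is one independent call with the same query.
    answers: List[str] = [sample_answer(query) for _ in range(num_agents)]

    # Voting phase: pick the most frequent answer. Counter.most_common breaks
    # ties in favor of the answer that was seen first.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


if __name__ == "__main__":
    # Stand-in for an LLM: replays a fixed sequence of noisy answers.
    canned = iter(["4", "5", "4", "3", "4"])
    print(sample_and_vote("What is 2 + 2?", lambda q: next(canned), num_agents=5))
```

In practice `sample_answer` would wrap a model call with a nonzero sampling temperature so that repeated calls can differ; increasing `num_agents` trades extra inference cost for a more stable vote.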

Related benchmarks

Task	Dataset	Metric	Result
Instruction Following	AlpacaEval	Win Rate	40.5	420
Multi-task Language Understanding	MMLU	Accuracy	90.47	353
Text Classification	AG News (test)	Accuracy	82.47	293
Arithmetic Reasoning	GSM8K	Accuracy	86.8	272
Commonsense Reasoning	CSQA	CSQA Accuracy	87.63	195
Arithmetic Reasoning	GSM8K (test)	Accuracy	77.4	189
Text Classification	TREC (test)	Accuracy	73.2	122
Question Answering	ScienceQA	Accuracy	71.53	96
Mathematical Reasoning	MAWPS (test)	Accuracy	92.4	87
Multi-task Language Understanding	MMLU (test)	Normalized Accuracy	60.92	87

Showing 10 of 30 rows
