Scaling Large Language Model-based Multi-Agent Collaboration

About

Recent breakthroughs in large language model-driven autonomous agents have revealed that multi-agent collaboration often surpasses any individual agent through collective reasoning. Inspired by the neural scaling law (increasing the number of neurons enhances performance), this study explores whether continually adding collaborative agents can yield similar benefits. Technically, we use directed acyclic graphs to organize agents into a multi-agent collaboration network (MacNet), upon which their interactive reasoning is topologically orchestrated for autonomous task solving. Extensive evaluations reveal that MacNet effectively supports collaboration among over a thousand agents, with irregular topologies outperforming regular ones. We also identify a collaborative scaling law: overall performance follows a logistic growth pattern as the number of agents scales, with collaborative emergence occurring earlier than traditional neural emergence. We speculate that this may be because scaling agents catalyzes their multidimensional considerations during interactive reflection and refinement, thereby producing more comprehensive artifacts. The code is available at https://github.com/OpenBMB/ChatDev/tree/macnet.
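To make the idea of topologically orchestrated agents concrete, here is a minimal sketch in Python. It is not the MacNet implementation: the function `run_network`, the helper `make_agent`, and the string-concatenating toy agents are all hypothetical stand-ins for LLM calls. The sketch assumes only what the abstract states, namely that agents sit on a directed acyclic graph and that each agent works after the agents upstream of it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_network(edges, agents, task):
    """Run agents in topological order over a DAG (illustrative only).

    edges:  list of (source, target) agent names
    agents: dict mapping name -> callable(task, upstream_outputs) -> str
    Returns (outputs per agent, sorted list of sink agents).
    """
    predecessors = {name: set() for name in agents}
    has_successor = set()
    for src, dst in edges:
        predecessors[dst].add(src)
        has_successor.add(src)

    outputs = {}
    # static_order() yields every node after all of its predecessors
    # and raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(predecessors).static_order():
        upstream = [outputs[p] for p in sorted(predecessors[name])]
        outputs[name] = agents[name](task, upstream)

    sinks = sorted(n for n in agents if n not in has_successor)
    return outputs, sinks


def make_agent(name):
    """Hypothetical toy agent: wraps its inputs instead of calling an LLM."""
    def agent(task, upstream):
        return f"{name}(" + ",".join(upstream or [task]) + ")"
    return agent


# Diamond-shaped network: a feeds b and c, which both feed d.
names = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
outputs, sinks = run_network(edges, {n: make_agent(n) for n in names}, "task")
print(sinks, outputs["d"])
```

In this toy run the single sink agent `d` sees the refinements of both `b` and `c`, which is the sense in which the final artifact aggregates the work of every upstream agent. Swapping the edge list changes the topology without touching the agents.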

Chen Qian, Zihao Xie, YiFei Wang, Wei Liu, Kunlun Zhu, Hanchen Xia, Yufan Dang, Zhuoyun Du, Weize Chen, Cheng Yang, Zhiyuan Liu, Maosong Sun • 2024

Related benchmarks

Task                                Dataset            Result            Rank
Mathematical Reasoning              GSM8K              Accuracy 87.95    1362
Code Generation                     HumanEval          Pass@1 84.57      1036
Mathematical Reasoning              MATH               Accuracy 72.1     882
Multi-task Language Understanding   MMLU               Accuracy 98       876
Language Understanding              MMLU               Accuracy 88.1     825
Code Generation                     HumanEval (test)   Pass@1 95.8       506
Mathematical Reasoning              GSM8K              Accuracy 83.01    499
Multitask Language Understanding    MMLU               Accuracy 84.31    413
Mathematical Reasoning              SVAMP              Accuracy 88.06    403
Mathematical Reasoning              MATH               Accuracy 52.9     338
Showing 10 of 52 rows
