AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
About
Autonomous agents empowered by Large Language Models (LLMs) have undergone significant improvements, enabling them to generalize across a broad spectrum of tasks. However, in real-world scenarios, cooperation among individuals is often required to enhance the efficiency and effectiveness of task accomplishment. Hence, inspired by human group dynamics, we propose AgentVerse, a multi-agent framework that can collaboratively and dynamically adjust its composition as a greater-than-the-sum-of-its-parts system. Our experiments demonstrate that AgentVerse can effectively deploy multi-agent groups that outperform a single agent. Furthermore, we examine the social behaviors that emerge among individual agents within a group during collaborative task accomplishment. In view of these behaviors, we discuss possible strategies to leverage the positive ones and mitigate the negative ones in order to improve the collaborative potential of multi-agent groups. Our code for AgentVerse will soon be released at https://github.com/OpenBMB/AgentVerse.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy | 89.91 | 983 |
| Code Generation | HumanEval | Pass@1 | 96.84 | 850 |
| Multi-task Language Understanding | MMLU | Accuracy | 78.36 | 842 |
| Mathematical Reasoning | MATH | Accuracy | 55.6 | 643 |
| Mathematical Reasoning | MATH | Accuracy | 54.5 | 535 |
| Mathematical Reasoning | SVAMP | Accuracy | 89.64 | 368 |
| Mathematical Reasoning | GSM8K | Accuracy (GSM8K) | 93.4 | 358 |
| Question Answering | GPQA | Accuracy | 40.2 | 258 |
| Code Generation | HumanEval+ | -- | -- | 189 |
| Code Generation | MBPP | Accuracy (%) | 82.4 | 146 |