CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society

About

The rapid advancement of chat-based language models has led to remarkable progress in complex task-solving. However, their success heavily relies on human input to guide the conversation, which can be challenging and time-consuming. This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents, and provides insight into their "cognitive" processes. To address the challenges of achieving autonomous cooperation, we propose a novel communicative agent framework named role-playing. Our approach involves using inception prompting to guide chat agents toward task completion while maintaining consistency with human intentions. We showcase how role-playing can be used to generate conversational data for studying the behaviors and capabilities of a society of agents, providing a valuable resource for investigating conversational language models. In particular, we conduct comprehensive studies on instruction-following cooperation in multi-agent settings. Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems, and open-sourcing our library to support research on communicative agents and beyond: https://github.com/camel-ai/camel.

Guohao Li, Hasan Abed Al Kader Hammoud, Hani Itani, Dmitrii Khizbullin, Bernard Ghanem • 2023

Related benchmarks

Task                        Dataset                            Metric           Result  Rank
Mathematical Reasoning      GSM8K                              Accuracy         45.6    983
Code Generation             HumanEval                          Pass@1           31.71   850
Mathematical Reasoning      MATH                               Accuracy         22.3    535
Mathematical Reasoning      MATH 500                           pass@1           67.4    153
Code Generation             MBPP                               Accuracy (%)     78.1    146
Mathematical Reasoning      GSM8K                              EM               88.6    115
Science Reasoning           GPQA                               Pass@1           11.11   35
Social Simulation           Social Simulation                  Configurability  2       24
National Policy Generation  National Policy Generation (test)  Count Agree      4       20
Mathematical Reasoning      AIME                               Pass@1           6.67    20
Showing 10 of 20 rows
