Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration

About

Human intelligence thrives on cognitive synergy, where collaboration among different minds yields superior outcomes compared to isolated individuals. In this work, we propose Solo Performance Prompting (SPP), which transforms a single LLM into a cognitive synergist by engaging it in multi-turn self-collaboration with multiple personas. A cognitive synergist is an intelligent agent that collaboratively combines multiple minds' strengths and knowledge to enhance problem-solving in complex tasks. By dynamically identifying and simulating different personas based on task inputs, SPP unleashes the potential of cognitive synergy in LLMs. Our in-depth analysis shows that assigning multiple fine-grained personas in LLMs improves problem-solving abilities compared to using a single or fixed number of personas. We evaluate SPP on three challenging tasks: Trivia Creative Writing, Codenames Collaborative, and Logic Grid Puzzle, encompassing both knowledge-intensive and reasoning-intensive types. Unlike previous works, such as Chain-of-Thought, that solely enhance the reasoning abilities of LLMs, experimental results demonstrate that SPP effectively reduces factual hallucination while maintaining strong reasoning capabilities. Additionally, comparative experiments show that cognitive synergy emerges only in GPT-4 and does not appear in less capable models, such as GPT-3.5-turbo and Llama2-13b-chat, which draws an interesting analogy to human development. Code, data, and prompts can be found at: https://github.com/MikeWangWZHL/Solo-Performance-Prompting.git.

Zhenhailong Wang, Shaoguang Mao, Wenshan Wu, Tao Ge, Furu Wei, Heng Ji • 2023
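
The self-collaboration loop described in the abstract can be illustrated with a minimal sketch. It assumes that a single model call simulates all personas and ends with a line marked "Final answer:". The prompt wording, the marker, and the names SPP_TEMPLATE, build_spp_prompt, extract_final_answer, and solve_with_spp are hypothetical and are not taken from the authors' repository, which holds the actual prompts.

```python
from typing import Callable

# Hypothetical prompt wording, written for illustration only.
# The authors' actual prompts are in their repository.
SPP_TEMPLATE = (
    "When faced with a task, first identify the participants (personas) "
    "whose expertise would help solve it.\n"
    "Then simulate a multi-turn collaboration among them: each persona "
    "contributes knowledge or critiques the current draft.\n"
    "When everyone is satisfied, write one line starting with "
    "'Final answer:' followed by the result.\n\n"
    "Task: {task}\n"
)


def build_spp_prompt(task: str) -> str:
    """Wrap a task in the (assumed) self-collaboration instructions."""
    return SPP_TEMPLATE.format(task=task)


def extract_final_answer(completion: str, marker: str = "Final answer:") -> str:
    """Return the text after the last marker, or the whole completion if absent."""
    idx = completion.rfind(marker)
    if idx == -1:
        return completion.strip()
    return completion[idx + len(marker):].strip()


def solve_with_spp(task: str, llm: Callable[[str], str]) -> str:
    """Send one prompt to the model and keep only its final answer.

    `llm` is any function mapping a prompt string to a completion string;
    no particular model API is assumed.
    """
    completion = llm(build_spp_prompt(task))
    return extract_final_answer(completion)
```

Passing the model as a plain callable keeps the sketch independent of any vendor SDK; a stub function that returns a canned dialogue is enough to exercise it.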

Related benchmarks

Task	Dataset	Result
Code Generation	HumanEval	Pass@1 88.32	1048
Mathematical Reasoning	MATH	Accuracy 51.7	882
Code Generation	HumanEval (test)	Pass@1 88.32	701
Code Generation	MBPP (test)	Pass@1 73.19	411
Code Generation	HumanEval+	--	393
Mathematical Reasoning	GSM8K	Accuracy (GSM8K) 92.8	358
Mathematical Reasoning	MATH	Accuracy 45.43	338
Multi-hop Question Answering	HotpotQA (test)	--	334
Arithmetic Reasoning	MultiArith	Accuracy 97.49	324
Multitask Language Understanding	MMLU	--	263

Showing 10 of 53 rows
