
Latent Collaboration in Multi-Agent Systems

About

Multi-agent systems (MAS) extend large language models (LLMs) from independent single-model reasoning to coordinated system-level intelligence. While existing LLM agents depend on text-based mediation for reasoning and communication, we take a step forward by enabling models to collaborate directly within the continuous latent space. We introduce LatentMAS, an end-to-end training-free framework that enables pure latent collaboration among LLM agents. In LatentMAS, each agent first performs auto-regressive latent thought generation through its last-layer hidden embeddings. A shared latent working memory then preserves and transfers each agent's internal representations, ensuring lossless information exchange. We provide theoretical analyses establishing that LatentMAS attains higher expressiveness and lossless information preservation with substantially lower complexity than vanilla text-based MAS. In addition, empirical evaluations across 9 comprehensive benchmarks spanning math and science reasoning, commonsense understanding, and code generation show that LatentMAS consistently outperforms strong single-model and text-based MAS baselines, achieving up to 14.6% higher accuracy, reducing output token usage by 70.8%-83.7%, and providing 4x-4.3x faster end-to-end inference. These results demonstrate that our new latent collaboration framework enhances system-level reasoning quality while offering substantial efficiency gains without any additional training. Code and data are fully open-sourced at https://github.com/Gen-Verse/LatentMAS.
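The latent-collaboration loop described above can be illustrated with a toy sketch. This is a minimal, hypothetical stand-in, not the paper's implementation: the LLM forward pass is replaced by a fixed linear map so the example stays self-contained, and the shapes, the mean-pooled handoff, and the function names are illustrative assumptions. The structure it mirrors is the one the abstract describes: each agent generates latent thoughts auto-regressively from last-layer hidden states, and a shared latent working memory carries those states to the next agent without decoding to text.

```python
# Toy sketch of LatentMAS-style latent collaboration. HYPOTHETICAL:
# a fixed linear map stands in for an LLM's last-layer forward pass,
# and the mean-pooled handoff is an illustrative choice.
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 8  # toy hidden-state dimensionality

# Stand-in for one transformer forward pass: input embedding -> last-layer
# hidden state. A real system would run the LLM here.
W = rng.standard_normal((HIDDEN, HIDDEN)) / np.sqrt(HIDDEN)

def forward(x):
    return np.tanh(W @ x)

def latent_thoughts(seed, n_steps):
    """Auto-regressive latent reasoning: each last-layer hidden state is fed
    back as the next step's input embedding, skipping text decoding."""
    h, trace = seed, []
    for _ in range(n_steps):
        h = forward(h)
        trace.append(h)
    return trace

# Agent 1 reasons in latent space; its hidden states go straight into a
# shared latent working memory (no lossy text serialization).
memory = []
memory.extend(latent_thoughts(rng.standard_normal(HIDDEN), n_steps=4))

# Agent 2 conditions on the shared memory (here: mean-pooled) and continues,
# appending its own latent thoughts to the same memory.
context = np.mean(memory, axis=0)
memory.extend(latent_thoughts(context, n_steps=4))

print(len(memory))  # 8 latent thoughts accumulated across both agents
```

The point of the sketch is the data flow: what crosses the agent boundary is a list of hidden-state vectors, never generated tokens, which is what makes the exchange lossless and cheap relative to text-mediated MAS.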

Jiaru Zou, Xiyuan Yang, Ruizhong Qiu, Gaotang Li, Katherine Tieu, Pan Lu, Ke Shen, Hanghang Tong, Yejin Choi, Jingrui He, James Zou, Mengdi Wang, Ling Yang • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Code Generation | HumanEval+ | -- | 189 |
| Mathematical Problem Solving | MATH | Accuracy 78.6 | 166 |
| Medical Question Answering | MedQA | Accuracy 81.2 | 109 |
| Math Word Problem Solving | GSM8K | Accuracy 95.2 | 91 |
| Code Generation | MBPP+ | Accuracy 75.7 | 75 |
| Question Answering | GPQA Diamond | Accuracy 63.6 | 62 |
| Mathematical Problem Solving | AIME 25 | Accuracy 63.3 | 54 |
| Multi-task Evaluation | Aggregate (AIME25, AIME24, MATH, GSM8K, HumanEval+, MBPP+, MedQA, GPQA-Diamond) | Average Accuracy 75.9 | 21 |
| Math Problem Solving | AIME 24 | Accuracy 73.3 | 21 |
| Math | AIME24 | Accuracy 66.7 | 20 |

Showing 10 of 13 rows.

Other info

GitHub: https://github.com/Gen-Verse/LatentMAS
