CMAD: Cooperative Multi-Agent Diffusion via Stochastic Optimal Control

About

Continuous-time generative models have achieved remarkable success in image restoration and synthesis. However, controlling the composition of multiple pre-trained models remains an open challenge. Current approaches largely treat composition as an algebraic composition of probability densities, such as via products or mixtures of experts. This perspective assumes the target distribution is known explicitly, which is almost never the case. In this work, we propose a different paradigm that formulates compositional generation as a cooperative Stochastic Optimal Control problem. Rather than combining probability densities, we treat pre-trained diffusion models as interacting agents whose diffusion trajectories are jointly steered, via optimal control, toward a shared objective defined on their aggregated output. We validate our framework on conditional MNIST generation and compare it against a na\"ive inference-time DPS-style baseline replacing learned cooperative control with per-step gradient guidance.

Riccardo Barbano, Alexander Denker, Zeljko Kereta, Runchang Li, Francisco Vargas• 2026

Related benchmarks

Task	Dataset	Result
Compositional generation	MNIST digit 0	Accuracy99.8	6
Compositional generation	MNIST digit 3	Classification Accuracy99.9	6
Compositional generation	MNIST digit 9	Accuracy99.32	6

Showing 3 of 3 rows

Other info

Follow for update

@wizwand_team Discord