Multi-Modal Multi-Agent Reinforcement Learning for Radiology Report Generation
About
We propose MARL-Rad, a multi-modal multi-agent reinforcement learning framework for radiology report generation that trains the entire agentic system on-policy within its deployed radiology workflow. MARL-Rad addresses the limitation of post-hoc agentization, where fixed LLMs are organized into hand-designed agentic workflows without being optimized for their assigned roles. Our framework decomposes chest X-ray interpretation into region-specific agents and a global integrating agent, and jointly optimizes them using clinically verifiable rewards. Experiments on the MIMIC-CXR and IU X-ray datasets show that MARL-Rad consistently improves clinical efficacy metrics such as RadGraph, CheXbert, and GREEN scores, achieving state-of-the-art clinical efficacy performance. Further analyses show that MARL-Rad improves laterality consistency and produces more accurate and detailed reports. A blinded clinician evaluation suggests that MARL-Rad produces reports that are clinically comparable to ground-truth reports.
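The decomposition described above (region-specific agents feeding a global integrating agent, all scored by one report-level reward) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `run_workflow`, `toy_reward`, and `shared_rewards` are hypothetical, the agents are plain callables rather than trained models, the token-overlap F1 is only a stand-in for clinical metrics such as RadGraph or CheXbert, and giving every agent the same reward is one simple credit-assignment choice, not necessarily the one MARL-Rad uses.

```python
from typing import Callable, Dict, List

# Hypothetical type: an agent maps a text prompt to generated text.
Agent = Callable[[str], str]


def run_workflow(
    region_agents: Dict[str, Agent], integrator: Agent, image_ref: str
) -> Dict[str, str]:
    """Run each region agent, then let the integrating agent merge the findings."""
    findings = {
        region: agent(f"Describe the {region} in {image_ref}")
        for region, agent in region_agents.items()
    }
    # The integrator sees one line per region, in insertion order.
    merged_prompt = "\n".join(f"{region}: {text}" for region, text in findings.items())
    return {"report": integrator(merged_prompt), **findings}


def toy_reward(report: str, reference: str) -> float:
    """Token-overlap F1: a stand-in for a clinically verifiable reward."""
    pred, ref = set(report.lower().split()), set(reference.lower().split())
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def shared_rewards(agent_names: List[str], reward: float) -> Dict[str, float]:
    """Give every agent the same report-level reward (assumed, simplest option)."""
    return {name: reward for name in agent_names}
```

In an actual on-policy setup, the per-agent rewards returned by `shared_rewards` would feed a policy-gradient update for each agent's rollout; that training step is omitted here because the abstract does not specify it.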
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Radiology Report Generation | MIMIC-CXR Findings | BLEU-4 | 5.6 | 26 |
| Radiology Report Generation | IU X-ray Findings | BLEU-4 | 4.6 | 21 |
| Radiology Report Generation | MIMIC-CXR Findings + Impression | BLEU-4 | 14.2 | 6 |
| Radiology Report Generation | IU X-ray Findings + Impression | BLEU-4 | 0.182 | 3 |