ClinicalAgents: Multi-Agent Orchestration for Clinical Decision Making with Dual-Memory

About

While Large Language Models (LLMs) have demonstrated potential in healthcare, they often struggle with the complex, non-linear reasoning required for accurate clinical diagnosis. Existing methods typically rely on static, linear mappings from symptoms to diagnoses, failing to capture the iterative, hypothesis-driven reasoning inherent in human clinicians. To bridge this gap, we introduce ClinicalAgents, a novel multi-agent framework designed to simulate the cognitive workflow of expert clinicians. Unlike rigid sequential chains, ClinicalAgents employs a dynamic orchestration mechanism modeled as a Monte Carlo Tree Search (MCTS) process. This allows an orchestrator to iteratively generate hypotheses, actively verify evidence, and trigger backtracking when critical information is missing. The foundation of this framework is a Dual-Memory architecture: a mutable working memory that maintains the evolving patient state for context-aware reasoning, and a static experience memory that retrieves clinical guidelines and historical cases via an active feedback loop. Extensive experiments demonstrate that ClinicalAgents achieves the best performance among the evaluated methods, significantly improving both diagnostic accuracy and explainability over strong single-agent and multi-agent baselines. Our code is released at https://github.com/ZhuohanGe/ClinicalAgents-Code.

Zhuohan Ge, Haoyang Li, Yubo Wang, Nicole Hu, Chen Jason Zhang, Qing Li • 2026
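
To make the loop described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy setup in which every name (`WorkingMemory`, `ExperienceMemory`, `verify`, `orchestrate`, `order_test`) and all clinical data are hypothetical. It also replaces full MCTS with a single-level UCB1 selection over candidate hypotheses, which keeps the select, verify, and back-up steps but omits deeper tree expansion. "Backtracking" is approximated as gathering one missing finding, writing it to working memory, and re-verifying.

```python
import math
from dataclasses import dataclass, field


@dataclass
class WorkingMemory:
    """Mutable patient state: finding name -> True/False (hypothetical structure)."""
    findings: dict = field(default_factory=dict)


@dataclass(frozen=True)
class ExperienceMemory:
    """Static store: diagnosis -> findings that support it (toy stand-in for guidelines)."""
    guidelines: dict

    def required_findings(self, diagnosis):
        return self.guidelines.get(diagnosis, ())


@dataclass
class Node:
    """One candidate hypothesis with its search statistics."""
    hypothesis: str
    visits: int = 0
    value: float = 0.0


def ucb1(node, total_visits, c=1.4):
    """Standard UCB1 score; unvisited nodes are explored first."""
    if node.visits == 0:
        return math.inf
    mean = node.value / node.visits
    return mean + c * math.sqrt(math.log(total_visits) / node.visits)


def verify(hypothesis, working, experience):
    """Return (support score in [0, 1], findings not yet observed)."""
    required = experience.required_findings(hypothesis)
    if not required:
        return 0.0, []
    missing = [f for f in required if f not in working.findings]
    supported = sum(1 for f in required if working.findings.get(f) is True)
    return supported / len(required), missing


def orchestrate(working, experience, order_test, iterations=20):
    """Select a hypothesis, verify it, fetch missing evidence, and back up the score."""
    nodes = [Node(h) for h in experience.guidelines]
    for step in range(1, iterations + 1):
        node = max(nodes, key=lambda n: ucb1(n, step))
        score, missing = verify(node.hypothesis, working, experience)
        if missing:
            # Critical information is absent: gather one finding, update
            # the working memory, then re-verify the same hypothesis.
            finding = missing[0]
            working.findings[finding] = order_test(finding)
            score, _ = verify(node.hypothesis, working, experience)
        node.visits += 1
        node.value += score
    best = max(nodes, key=lambda n: n.value / n.visits if n.visits else 0.0)
    return best.hypothesis


# Toy usage with invented data (not taken from the paper).
experience = ExperienceMemory({
    "pneumonia": ("fever", "cough", "infiltrate_on_xray"),
    "pulmonary_embolism": ("dyspnea", "elevated_d_dimer", "tachycardia"),
})
working = WorkingMemory({"fever": True, "cough": True})
ground_truth = {
    "infiltrate_on_xray": True,
    "dyspnea": False,
    "elevated_d_dimer": False,
    "tachycardia": True,
}
diagnosis = orchestrate(working, experience, order_test=ground_truth.get)
print(diagnosis)
```

In this sketch the experience memory is frozen while the working memory is updated in place, mirroring the static/mutable split the abstract describes. In a real system, the lookup table and the `order_test` callback would presumably be replaced by retrieval and agent calls.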

Related benchmarks

Task	Dataset	Metric	Result
Medical Visual Question Answering	PathVQA (test)	Accuracy	72.78	55
Question Answering	PubMedQA PQA-L (test)	Accuracy	76.6	45
Clinical Decision-Making	MedChain (overall)	Specialty Referral Accuracy (Lv1)	62.43	18
Medical Question Answering	MedBullets (test)	Accuracy	82.79	18
Medical Question Answering	MedQA US (test)	Accuracy	89.95	18
