ClinicalAgents: Multi-Agent Orchestration for Clinical Decision Making with Dual-Memory
About
While Large Language Models (LLMs) have demonstrated potential in healthcare, they often struggle with the complex, non-linear reasoning required for accurate clinical diagnosis. Existing methods typically rely on static, linear mappings from symptoms to diagnoses, failing to capture the iterative, hypothesis-driven reasoning inherent to human clinicians. To bridge this gap, we introduce ClinicalAgents, a novel multi-agent framework designed to simulate the cognitive workflow of expert clinicians. Unlike rigid sequential chains, ClinicalAgents employs a dynamic orchestration mechanism modeled as a Monte Carlo Tree Search (MCTS) process. This allows an Orchestrator to iteratively generate hypotheses, actively verify evidence, and trigger backtracking when critical information is missing. Central to this framework is a Dual-Memory architecture: a mutable Working Memory that maintains the evolving patient state for context-aware reasoning, and a static Experience Memory that retrieves clinical guidelines and historical cases via an active feedback loop. Extensive experiments demonstrate that ClinicalAgents achieves state-of-the-art performance, significantly enhancing both diagnostic accuracy and explainability compared to strong single-agent and multi-agent baselines.
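The control flow described in the abstract (hypothesize, verify evidence, backtrack when information is missing) can be illustrated with a minimal sketch. This is not the authors' implementation. It replaces the MCTS-based orchestration with a plain greedy loop over candidate diagnoses, and every name in it (`ExperienceMemory`, `WorkingMemory`, `orchestrate`, the guideline table, and the toy findings) is a hypothetical stand-in chosen only to show how a mutable working memory and a read-only experience memory might interact.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExperienceMemory:
    """Static store (assumed shape): diagnosis -> findings a guideline requires."""
    guidelines: dict

    def required_findings(self, diagnosis):
        return self.guidelines.get(diagnosis, frozenset())


@dataclass
class WorkingMemory:
    """Mutable patient state that evolves as evidence is gathered."""
    findings: set = field(default_factory=set)
    trace: list = field(default_factory=list)


def orchestrate(candidates, working, experience, test_results):
    """Greedy stand-in for the orchestrator: try each hypothesis in order,
    verify it against the guideline, and backtrack if evidence is missing."""
    for diagnosis in candidates:
        working.trace.append(f"hypothesize:{diagnosis}")
        required = experience.required_findings(diagnosis)
        # Active verification: look up each missing finding in the test results.
        for finding in sorted(required - working.findings):
            if test_results.get(finding):
                working.findings.add(finding)
                working.trace.append(f"verified:{finding}")
        if required <= working.findings:
            working.trace.append(f"accept:{diagnosis}")
            return diagnosis
        # Critical evidence is still missing, so drop this hypothesis.
        working.trace.append(f"backtrack:{diagnosis}")
    return None


# Toy data, purely illustrative and not taken from the paper.
experience = ExperienceMemory({
    "pneumonia": frozenset({"fever", "infiltrate_on_xray"}),
    "bronchitis": frozenset({"cough"}),
})
working = WorkingMemory(findings={"fever", "cough"})
result = orchestrate(
    ["pneumonia", "bronchitis"], working, experience,
    test_results={"infiltrate_on_xray": False},
)
print(result)
print(working.trace)
```

In this toy run the first hypothesis is abandoned because the required imaging finding is not confirmed, and the loop falls back to the second candidate. A search-based orchestrator would instead score and expand branches rather than visiting candidates in a fixed order.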
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Medical Visual Question Answering | PathVQA (test) | Accuracy | 72.78 | 55 |
| Question Answering | PubMedQA PQA-L (test) | Accuracy | 76.6 | 43 |
| Clinical Decision-Making | MedChain (overall) | Specialty Referral Accuracy (Lv1) | 62.43 | 18 |
| Medical Question Answering | MedBullets (test) | Accuracy | 82.79 | 18 |
| Medical Question Answering | MedQA US (test) | Accuracy | 89.95 | 18 |