MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning

About

Large language models (LLMs), despite their remarkable progress across various general domains, encounter significant barriers in medicine and healthcare. This field faces unique challenges such as domain-specific terminologies and reasoning over specialized knowledge. To address these issues, we propose MedAgents, a novel multi-disciplinary collaboration framework for the medical domain. MedAgents leverages LLM-based agents in a role-playing setting that participate in a collaborative multi-round discussion, thereby enhancing LLM proficiency and reasoning capabilities. This training-free framework encompasses five critical steps: gathering domain experts, proposing individual analyses, summarising these analyses into a report, iterating over discussions until a consensus is reached, and ultimately making a decision. Our work focuses on the zero-shot setting, which is applicable in real-world scenarios. Experimental results on nine datasets (MedQA, MedMCQA, PubMedQA, and six subtasks from MMLU) establish that our proposed MedAgents framework excels at mining and harnessing the medical expertise within LLMs, as well as extending their reasoning abilities. Our code can be found at https://github.com/gersteinlab/MedAgents.
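The five steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that): the `LLM` callable interface, the function names `gather_experts`, `approves`, and `med_agents`, every prompt string, the yes/no consensus vote, and the `fake_llm` stub are all assumptions made here for illustration.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to generated text.
LLM = Callable[[str], str]


def gather_experts(llm: LLM, question: str, k: int = 3) -> List[str]:
    """Step 1: ask the model which specialties are relevant to the question."""
    reply = llm(f"List {k} medical specialties, comma-separated, relevant to: {question}")
    return [name.strip() for name in reply.split(",") if name.strip()][:k]


def approves(llm: LLM, role: str, report: str) -> bool:
    """Ask one role-played expert for a yes/no vote on the current report."""
    reply = llm(f"You are a {role} expert. Do you approve this report? Answer yes or no.\n{report}")
    return reply.strip().lower().startswith("yes")


def med_agents(llm: LLM, question: str, options: str, max_rounds: int = 3) -> str:
    """Run the five-step loop; prompts and voting rule are illustrative only."""
    experts = gather_experts(llm, question)
    # Step 2: each expert proposes an individual analysis.
    analyses = [
        llm(f"You are a {role} expert. Analyze: {question}\nOptions: {options}")
        for role in experts
    ]
    # Step 3: summarise the analyses into a single report.
    report = llm("Summarize these analyses into one report:\n" + "\n".join(analyses))
    # Step 4: iterate until every expert approves or the round budget runs out.
    for _ in range(max_rounds):
        dissenters = [role for role in experts if not approves(llm, role, report)]
        if not dissenters:
            break
        feedback = [
            llm(f"You are a {role} expert. Suggest a revision to:\n{report}")
            for role in dissenters
        ]
        report = llm(
            "Revise the report using this feedback:\n"
            + "\n".join(feedback)
            + "\nReport:\n"
            + report
        )
    # Step 5: make the final decision from the agreed report.
    return llm(
        f"Given the report:\n{report}\nQuestion: {question}\n"
        f"Options: {options}\nReply with one option letter."
    ).strip()


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model so the sketch runs offline."""
    if prompt.startswith("List"):
        return "Cardiology, Neurology, Pharmacology"
    if "Do you approve" in prompt:
        return "yes"
    if prompt.startswith("Given the report"):
        return "B"
    return "stub analysis"


answer = med_agents(fake_llm, "Which drug class lowers LDL cholesterol?", "A) Diuretics B) Statins")
print(answer)  # with the stub above this prints "B"
```

Passing the model as a plain callable keeps the sketch training-free and backend-agnostic, in the spirit of the zero-shot setting: replacing `fake_llm` with a wrapper around a real model would leave the control flow unchanged.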

Xiangru Tang, Anni Zou, Zhuosheng Zhang, Ziming Li, Yilun Zhao, Xingyao Zhang, Arman Cohan, Mark Gerstein • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Medical Question Answering | MedMCQA | Accuracy: 74.8 | 253 |
| Question Answering | PubMedQA | Accuracy: 76.8 | 145 |
| Medical Question Answering | MedMCQA (test) | Accuracy: 74.8 | 134 |
| Question Answering | MedQA | Accuracy: 83.7 | 70 |
| Question Answering | MedQA (test) | Accuracy: 83.7 | 61 |
| Multiple-choice Question Answering | MMLU Medical and Biological Sub-tasks | Clinical Knowledge Accuracy: 91 | 24 |
| PCOS detection | Kerala Dataset (public) | Accuracy: 87.16 | 18 |
| PCOS detection | GED Dataset (private) | Accuracy: 84.71 | 18 |
| Drug-Food Interaction Violation Evaluation | Drug-Food Interaction Evaluation Set: Warfarin, Potassium-sparing diuretic, and Statin cohorts (high-risk post-stroke cohorts) | Cohort Size (N): 157 | 13 |
| Nutrition care documentation quality assessment | Nutrition care documentation for 330 stroke survivors 1.0 (test) | Diagnosis Score: 0.00e+0 | 13 |
Showing 10 of 13 rows
