Medchain: Bridging the Gap Between LLM Agents and Clinical Practice with Interactive Sequence

About

Clinical decision making (CDM) is a complex, dynamic process crucial to healthcare delivery, yet it remains a significant challenge for artificial intelligence systems. While Large Language Model (LLM)-based agents have been tested on general medical knowledge using licensing exams and knowledge question-answering tasks, their performance in real-world CDM scenarios is limited due to the lack of comprehensive testing datasets that mirror actual medical practice. To address this gap, we present MedChain, a dataset of 12,163 clinical cases that covers five key stages of the clinical workflow. MedChain distinguishes itself from existing benchmarks with three key features of real-world clinical practice: personalization, interactivity, and sequentiality. Further, to tackle real-world CDM challenges, we also propose MedChain-Agent, an AI system that integrates a feedback mechanism and an MCase-RAG module to learn from previous cases and adapt its responses. MedChain-Agent demonstrates remarkable adaptability in gathering information dynamically and handling sequential clinical tasks, significantly outperforming existing approaches.

Jie Liu, Wenxuan Wang, Zizhan Ma, Guolin Huang, Yihang SU, Kao-Jung Chang, Wenting Chen, Haoliang Li, Linlin Shen, Michael Lyu • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Medical Visual Question Answering | PathVQA (test) | Accuracy: 73.56 | 55 |
| Question Answering | PubMedQA PQA-L (test) | Accuracy: 74.2 | 43 |
| Medical Question Answering | MedQA US (test) | Accuracy: 90.02 | 18 |
| Clinical Decision-Making | MedChain (overall) | Specialty Referral Accuracy (Lv1): 59.37 | 18 |
| Medical Question Answering | MedBullets (test) | Accuracy: 81.82 | 18 |