
Med42-v2: A Suite of Clinical LLMs

About

Med42-v2 introduces a suite of clinical large language models (LLMs) designed to address the limitations of generic models in healthcare settings. These models are built on the Llama3 architecture and fine-tuned on specialized clinical data. They underwent multi-stage preference alignment so that they respond effectively to natural prompts. Generic models are often preference-aligned to avoid answering clinical queries as a precaution; Med42-v2 is specifically trained to overcome this limitation, enabling its use in clinical settings. Across various medical benchmarks, Med42-v2 models outperform both the original Llama3 models, in the 8B and 70B parameter configurations, and GPT-4. These LLMs are developed to understand clinical queries, perform reasoning tasks, and provide valuable assistance in clinical environments. The models are publicly available at https://huggingface.co/m42-health.

Clément Christophe, Praveen K Kanithi, Tathagata Raha, Shadab Khan, Marco AF Pimentel • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Medical Question Answering | MedMCQA | Accuracy: 62.28 | 253 |
| Medical Question Answering | MedQA | Accuracy: 59.78 | 109 |
| Medical Question Answering | PubMedQA | Accuracy: 78.1 | 45 |
| Medical order extraction | SIMORD (test) | Match Count: 65.2 | 22 |
| Clinical Diagnostic Reasoning | Clinical Diagnostic Reasoning Benchmark 1.0 (test) | ICD Recall: 27.87 | 13 |
| Biomedical Question Answering | Four biomedical QA datasets macro-averaged (test) | Faithfulness: 85.3 | 4 |
