
Medical QA Benchmarks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Multi-choice medical QA | Multi-choice medical QA benchmarks (test) | MMLU-Med Accuracy: 70.7 | 28
Medical Question Answering | Medical QA Benchmarks (MedQA, MedMCQA, MMLU*, CMB, CMExam, CMMLU*) (test) | MedQA Accuracy: 64.1 | 20