
MedAgentsBench

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Medical Question Answering | MedAgentsBench | MedBullets Score: 59.55 | 18
Medical Reasoning | MedAgentsBench Hard Subsets | MEDQA: 0.52 | 12