
MMedbench

Benchmarks

Task Name                                Dataset Name                    SOTA Result                     Trend
Medical Knowledge Evaluation             MMedbench English subset (val)  Accuracy: 60.33                 36
Multilingual Medical Question Answering  MMedBench (test)                Accuracy (Chinese): 83.07       20
Medical Multi-choice QA                  MMedBench (test)                Token Accuracy: 92.55           16
Medical Multi-choice Question Answering  MMedBench (test)                Token Perplexity (log): 0.1494  16
Knowledge Boundary Expression            MMedBench (test)                F1 Score: 69.9                  15
Medical Question Answering               MMedBench 1.0 (test)            Chinese Accuracy: 84.47         9