MMedbench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Medical Knowledge Evaluation | MMedbench English subset (val) | Accuracy: 60.33 | 36 |
| Multilingual Medical Question Answering | MMedBench (test) | Accuracy (Chinese): 83.07 | 20 |
| Medical Multi-choice QA | MMedBench (test) | Token Accuracy: 92.55 | 16 |
| Medical Multi-choice Question Answering | MMedBench (test) | Token Perplexity (log): 0.1494 | 16 |
| Knowledge Boundary Expression | MMedBench (test) | F1 Score: 69.9 | 15 |
| Medical Question Answering | MMedBench 1.0 (test) | Chinese Accuracy: 84.47 | 9 |