medical QA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Medical Question Answering | Out-of-domain medical QA History, Engineering, Law (test) | History Accuracy: 50.4 | 10 |
| Medical Question Answering | Medical QA | GPT-4 Score: 92.5 | 9 |
| Medical Question Answering | Medical QA offline evaluation | Honesty: 0.83 | 3 |