
MedBrowseComp

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Question Answering | MedBrowseComp | F1 Score: 23.2 | 9 |
| Medical Reasoning | MedBrowseComp | Accuracy (Before): 35.1 | 4 |
| medical browsing and synthesis | MedBrowseComp | Avg@3: 56.5 | 4 |
| Medical Information Retrieval and Comparison | MedBrowseComp | Pass@1: 29 | 4 |
| Short-fact | MedBrowseComp | Accuracy: 8.4 | 2 |