What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams

About

Open domain question answering (OpenQA) tasks have recently been attracting growing attention from the natural language processing (NLP) community. In this work, we present the first free-form multiple-choice OpenQA dataset for solving medical problems, MedQA, collected from professional medical board exams. It covers three languages: English, simplified Chinese, and traditional Chinese, and contains 12,723, 34,251, and 14,123 questions for the three languages, respectively. We implement both rule-based and popular neural methods by sequentially combining a document retriever and a machine comprehension model. Through experiments, we find that even the current best method achieves only 36.7%, 42.0%, and 70.1% test accuracy on the English, traditional Chinese, and simplified Chinese questions, respectively. We expect MedQA to present great challenges to existing OpenQA systems and hope that it can serve as a platform to promote much stronger OpenQA models from the NLP community in the future.
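The abstract describes systems that chain a document retriever with a comprehension model. The sketch below illustrates only that general retrieve-then-score shape on a toy corpus; it is not the authors' implementation. The function names (`tokenize`, `retrieve`, `answer`), the IDF-weighted overlap score, and the three-sentence corpus are all assumptions made for illustration, and the "reader" is replaced by a simple stand-in that reuses the retrieval score.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, k=1):
    """Toy retriever: rank documents by IDF-weighted term overlap with the query."""
    doc_terms = [set(tokenize(doc)) for doc in documents]
    # Document frequency of every term in the corpus.
    df = Counter(term for terms in doc_terms for term in terms)
    n_docs = len(documents)
    query_terms = set(tokenize(query))
    scored = []
    for doc, terms in zip(documents, doc_terms):
        score = sum(math.log(1 + n_docs / df[t]) for t in query_terms & terms)
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def answer(question, options, documents, k=1):
    """Pick the option whose (question + option) query retrieves the best evidence.

    A real system would pass the retrieved text to a comprehension model;
    here the summed retrieval score stands in for that step.
    """
    def option_score(option):
        hits = retrieve(f"{question} {option}", documents, k)
        return sum(score for score, _ in hits)

    return max(options, key=option_score)


# Hypothetical mini-corpus and question, invented for this example.
CORPUS = [
    "Iron deficiency anemia presents with fatigue, pallor, and low ferritin.",
    "Vitamin B12 deficiency causes megaloblastic anemia and peripheral neuropathy.",
    "Hypothyroidism presents with weight gain, cold intolerance, and fatigue.",
]
QUESTION = "A patient has fatigue, pallor, and low ferritin. What is the most likely diagnosis?"
OPTIONS = ["Iron deficiency anemia", "Vitamin B12 deficiency", "Hypothyroidism"]

print(answer(QUESTION, OPTIONS, CORPUS))
```

Scoring each option with its own query keeps the retriever and the scoring step separable, so either piece could be swapped for a stronger component without changing the other.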

Di Jin, Eileen Pan, Nassim Oufattole, Wei-Hung Weng, Hanyi Fang, Peter Szolovits • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Ranking Consistency Analysis | MMLU-Pro health Human aging | Spearman Correlation 0.62 | 8 |
| Physical chemistry | ChemBench Physical Chemistry | Spearman Correlation 0.63 | 8 |
| Ranking Consistency Analysis | MMLU-Pro health Virology | Spearman Correlation 0.44 | 8 |
| Ranking Consistency Analysis | MMLU-Pro Medical genetics health | Spearman Correlation 0.35 | 8 |
| Technical chemistry | ChemBench Technical Chemistry | Spearman Correlation 0.72 | 8 |
| Analytical chemistry | ChemBench Analytical Chemistry | Spearman Correlation 0.31 | 8 |
| Inorganic chemistry | ChemBench Inorganic Chemistry | Spearman Correlation 0.42 | 8 |
| Material science | ChemBench Material science | Spearman Correlation 0.01 | 8 |
| Ranking Consistency Analysis | MMLU-Pro Nutrition health | Spearman Correlation 0.45 | 8 |
| Ranking Consistency Analysis | MMLU-Pro Anatomy health | Spearman Correlation -0.19 | 8 |
Showing 10 of 12 rows
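
The results listed for these benchmarks are Spearman correlations. This page does not say which two rankings are being compared, so the sketch below shows only how the statistic itself is computed: rank both score lists (ties share the mean rank), then take the Pearson correlation of the ranks. The function names and the example scores are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores of four models on two benchmarks.
scores_a = [0.71, 0.64, 0.58, 0.80]
scores_b = [0.55, 0.60, 0.41, 0.77]
print(round(spearman(scores_a, scores_b), 2))
```

A value near 1 means the two benchmarks order the models almost identically, a value near 0 means the orderings are unrelated, and a negative value means they tend to be reversed.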
