Multiple Choice Question Answering on MMLU Medical and Biological Sub-tasks

Top result: 95.8 Clinical Knowledge Accuracy, GPT-4 (Medprompt)

[Chart: Clinical Knowledge Accuracy over time, Nov 16, 2023 to Mar 30, 2024]
Updated 3d ago

Evaluation Results

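| Method | Date | Clinical Knowledge | Medical Genetics | Anatomy | Professional Medicine | College Biology | College Medicine | |
|---|---|---|---|---|---|---|---|---|
| GPT-4 (Medprompt) | 2023.11 | 95.8 | 98 | 89.6 | 95.2 | 97.9 | 89 | - |
| | 2023.11 | 91 | 93 | 83.5 | 96 | 94.3 | 87.6 | - |
| | 2023.11 | 88.7 | 92 | 84.4 | 95.2 | 95.8 | 83.2 | - |
| | 2024.03 | 87.2 | 88.2 | 84.4 | 87.2 | 87.9 | 86.6 | 86.9 |
| | 2023.11 | 86.4 | 92 | 80 | 93.8 | 95.1 | 76.9 | - |
| | 2024.03 | 86.4 | 92 | 80 | 93.8 | 93.8 | 76.3 | 87.1 |
| | 2023.11 | 80.4 | 75 | 63.7 | 83.8 | 88.9 | 76.3 | - |
| | 2023.11 | 77.7 | 79 | 65.2 | 82.1 | 78.5 | 69.8 | - |
| | 2024.02 | 74.71 | 74 | 65.92 | 72.79 | 72.91 | 64.73 | - |
| | 2024.03 | 74.3 | 76.7 | 74.8 | 75.3 | 76.1 | 74.3 | 75.2 |
| | 2024.03 | 71.6 | 74.8 | 63.2 | 77.3 | 70.8 | 65.2 | 70.5 |
| | 2024.03 | 68.7 | 68 | 60.7 | 69.9 | 72.9 | 63.6 | 67.3 |
| | 2024.02 | 62.9 | 57 | 55.6 | 59.4 | 62.5 | 57.2 | - |
| | 2024.02 | 62.8 | 62.7 | 57.5 | 63.5 | 64.3 | 55.7 | - |
| | 2024.02 | 62.5 | 64.7 | 55.8 | 62.7 | 64.8 | 56.3 | - |
| | 2024.02 | 62.3 | 67 | 55.8 | 61.4 | 66.9 | 58 | - |
| | 2024.02 | 60.1 | 65 | 58.5 | 60.5 | 60.4 | 56.5 | - |
| | 2024.03 | 59.9 | 64 | 56.5 | 60.4 | 59 | 54.7 | 64.6 |
| | 2024.02 | 59.9 | 64 | 56.5 | 60.4 | 59 | 54.7 | - |
| | 2024.03 | 57.7 | 63.8 | 56.9 | 56 | 57.1 | 48.9 | 56.7 |
| | 2024.02 | 53.1 | 58 | 54.1 | 58.8 | 58.1 | 48.6 | - |
| | 2024.02 | 51.4 | 52 | 49.4 | 53.3 | 50.7 | 49.1 | - |
| | 2024.02 | 41.6 | 50.3 | 46.4 | 27.9 | 44.4 | 30.8 | - |
| | 2024.02 | 24.5 | 27.7 | 35.3 | 17.4 | 30.3 | 23.3 | - |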