
Multi-choice Medical QA on Multi-choice medical QA benchmarks (test)

Top result: MedLA, 70.7 MMLU-Med accuracy

[Chart: MMLU-Med accuracy over time; Sep 28, 2025]
Updated 1mo ago

Evaluation Results

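| Date    | MMLU-Med | Benchmark 2 | Benchmark 3 | Average |
|---------|----------|-------------|-------------|---------|
| 2025.09 | 70.7     | 62.6        | 76.5        | 69.9    |
| 2025.09 | 68.1     | 54.9        | 70.6        | 64.5    |
| 2025.09 | 67.7     | 56.3        | 68.7        | 64.2    |
| 2025.09 | 65.1     | 55.2        | 64.2        | 61.5    |
| 2025.09 | 65       | 53.4        | 64          | 60.8    |
| 2025.09 | 64.3     | 53.2        | 64.1        | 60.5    |
| 2025.09 | 63.6     | 38.3        | 62.2        | 54.7    |
| 2025.09 | 63.4     | 47.7        | 64.4        | 58.5    |
| 2025.09 | 63.4     | 56.6        | 65.4        | 61.8    |
| 2025.09 | 63.4     | 47.4        | 65.1        | 58.6    |
| 2025.09 | 62.5     | 51.6        | 63.8        | 59.3    |
| 2025.09 | 60.3     | 39.9        | 48.5        | 49.6    |
| 2025.09 | 60.2     | 46.8        | 65.2        | 57.4    |
| 2025.09 | 60       | 40.1        | 49.3        | 49.8    |
| 2025.09 | 57.9     | 48.7        | 71.9        | 59.5    |
| 2025.09 | 57.1     | 39.5        | 64.6        | 53.7    |
| 2025.09 | 50.2     | 40.8        | 64.6        | 51.9    |
| 2025.09 | 45.2     | 36.2        | 50.3        | 43.9    |
| 2025.09 | 44.2     | 25.3        | 34.6        | 34.7    |
| 2025.09 | 41.5     | 35.4        | 36.3        | 37.7    |
| 2025.09 | 37.6     | 28.1        | 56.8        | 40.8    |
| 2025.09 | 32.2     | 38          | 59.4        | 43.2    |
| 2025.09 | 31.9     | 47.5        | 70.6        | 50      |
| 2025.09 | 31.8     | 25.1        | 54.7        | 37.2    |
| 2025.09 | 31.7     | 47          | 70.7        | 49.8    |
| 2025.09 | 28.8     | 42.5        | 70.6        | 47.3    |
| 2025.09 | 20.7     | 24.7        | 34.6        | 26.7    |
| 2025.09 | 20.4     | 20.8        | 20.8        | 20.7    |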
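
The fourth value in each results row appears to be the arithmetic mean of the first three scores, rounded to one decimal place. The page does not state this, so treat it as an assumption. A minimal check in Python, using the first, second, and last rows (the helper name `mean_score` is hypothetical, introduced only for this illustration):

```python
def mean_score(scores):
    """Arithmetic mean of per-benchmark accuracies, rounded to one decimal."""
    return round(sum(scores) / len(scores), 1)


# (three per-benchmark scores, fourth value reported in the same row)
rows = [
    ((70.7, 62.6, 76.5), 69.9),  # first row
    ((68.1, 54.9, 70.6), 64.5),  # second row
    ((20.4, 20.8, 20.8), 20.7),  # last row
]

for scores, reported in rows:
    # Assumption: the reported value is an unweighted mean of the three scores.
    assert mean_score(scores) == reported
```

If the assumption holds, the same check should pass for the remaining rows as well. An unweighted mean gives each benchmark equal influence regardless of its test-set size.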