Medical Question Answering on PubMedQA (test)

12.2 CUS Score

GPT-3.5-turbo (Baseline)

5.73127.41069.0910.7694 Nov 20, 2025
Updated 18d ago

Evaluation Results

Method  Links
2025.11  12.2  97.58
2025.11  8.03  90.1
2025.11  7.11  98.2
2025.11  5.98  94.53