
Medical Question Answering on HealthBench (All Set)

Overall Score: 58.56

GPT-5.4

Mar 23, 2026
Updated 2mo ago

Evaluation Results

Method | Links
2026.03
58.56 | 63.36 | 53.14 | 47.49 | 63.33 | 55.62
2026.03
55.59 | 63.17 | 50.1 | 51.93 | 50.27 | 44.7
55.24 | 62.66 | 46.14 | 34.74 | 62.12 | 41.53
2026.03
54.84 | 56.9 | 45.74 | 56.72 | 53.17 | 47.17