Clinical Question Answering on HealthBench Hard Set
[Chart: Overall Score. Top result: Oph-Guid-Rag, 0.3861 (Mar 23, 2026).]
Evaluation Results
| Method | Date | Overall Score | Accuracy | Completeness | Instruction Following | Context Awareness | Communication Quality |
|---|---|---|---|---|---|---|---|
| Oph-Guid-Rag | 2026.03 | 0.3861 | 65.76 | 4.83 | 13.33 | 39.69 | 91.67 |
| GPT-5.2 | 2026.03 | 0.2969 | 59.56 | 9.71 | 13.33 | 34.19 | 61.67 |