Question Answering on MMLU Clinical Knowledge (test)
Best accuracy: 63.21 (PubMed Reasoner)
[Chart: Accuracy over time; data point dated Mar 28, 2026]
Updated 2mo ago
Evaluation Results
Method            Details                      Date      Accuracy
PubMed Reasoner   Backbone Model=GPT           2026.03   63.21
PubMed Reasoner   Backbone Model=Gemini        2026.03   61.36
Self-reflection   Backbone Model=GPT           2026.03   60.79
Self-reflection   Backbone Model=Gemini        2026.03   60.67
GPT4o             Retrieval/Agent Strate...    2026.03   60.64
Gemini-2.5 Pro    Retrieval/Agent Strate...    2026.03   60.52
RAG method        Backbone Model=Gemini        2026.03   58.94
RAG method        Backbone Model=GPT           2026.03   58.26