Medical Reasoning on MedQA (EM)
Top result: GraphWalker, 61.1 EM (Apr 8, 2026). Updated 9d ago.
Evaluation Results
Method        Backbone LLM  Date     EM
GraphWalker   Qwen3-14B     2026.04  61.1
Semantic-emb  Qwen3-14B     2026.04  59.58
CONE          Qwen3-14B     2026.04  59.17
Zero-shot     Qwen3-14B     2026.04  56.56
SPELL         Qwen3-14B     2026.04  54.1
LMS3          Qwen3-14B     2026.04  54.09
Delta-KNN     Qwen3-14B     2026.04  53.47
Influence     Qwen3-14B     2026.04  53.02
Random        Qwen3-14B     2026.04  52.96
GradSel       Qwen3-14B     2026.04  52.93
IDS           Qwen3-14B     2026.04  52.59
Time-series   Qwen3-14B     2026.04  52.11