Medical Reasoning on MedMCQA
[Chart: Accuracy over time. Top result: 86 (ReConcile), Aug 11, 2025.]
Updated 18d ago
Evaluation Results
Method          Base Model    Date      Accuracy
ReConcile       GPT-4o        2025.08   86
MedAgents       GPT-4o        2025.08   85.7
TMA-AllCompon   GPT-4o        2025.08   85.4
DyLAN           GPT-4o        2025.08   84.6
MDAgents        GPT-4o        2025.08   80.8
TMA-AllCompon   MedGemma-4B   2025.08   63.6
TMA-AllCompon   Gemma-3-4B    2025.08   53.3
DyLAN           Gemma-3-4B    2025.08   52.4
ReConcile       Gemma-3-4B    2025.08   52
MedAgents       Gemma-3-4B    2025.08   50.8
MDAgents        Gemma-3-4B    2025.08   38.5