Our new X account is live! Follow @wizwand_team for updates
SOTA Medical Reasoning benchmarks and papers with code | Wizwand
Medical Reasoning
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| XMEMRs | RE-MCDF | Recall | 42.33 | 22 | 3d ago |
| NEEMRs | RE-MCDF | Recall | 46.13 | 22 | 3d ago |
| Medical-O1-Reasoning-SFT | LLM-AutoDP | Wins | 1 | 12 | 3d ago |
| Medical-O1-Reasoning-SFT (test) | LLM-AutoDP | Wins | 0.5127 | 12 | 3d ago |
| MedAgentsBench Hard Subsets | MEDCOG-META | MEDQA | 0.52 | 12 | 4d ago |
| MedQA | Med-Gemini | Accuracy | 91.1 | 10 | 2d ago |
| DeepChestVQA | 3DMedAgent | Accuracy (Attenuation Pattern) | 62 | 9 | 4d ago |
| DeepTumorVQA refined 2025b | 3DMedAgent (GPT-5) | Fatty Liver Accuracy | 0.77 | 9 | 4d ago |
| DDXPlus | Gemini-2.5-pro | Performance Score | 81.1 | 8 | 4d ago |
| SuperGPQA Med | RLVR | Accuracy | 0.3831 | 8 | 4d ago |
| MedXpert | MedVerse | Accuracy | 19.3 | 7 | 4d ago |
| MedBullets 5-option multiple choice | Huatuo-o1 | Accuracy | 53.9 | 7 | 4d ago |
| MedBullets 4-option multiple choice | MedVerse | Accuracy | 62.3 | 7 | 4d ago |
| Humanity's Last Exam (HLE) Medical | MedReason | Accuracy | 20.8 | 7 | 4d ago |
| HLE-Med | Gemini 3.0 Pro | Pass@1 | 36.91 | 4 | 4d ago |
| HealthBench | Baichuan-M3 | HealthBench Score | 66.2 | 4 | 4d ago |
| Medical task | SDFT | Accuracy | 43.7 | 3 | 2d ago |
| ICD-Bench Difficulty Level 5 1.0 | OURS-14B | Accuracy | 59.05 | 2 | 4d ago |
| ICD-Bench Difficulty Level 4 1.0 | OURS-14B | Accuracy | 71.5 | 2 | 4d ago |
| ICD-Bench Difficulty Level 3 1.0 | OURS-14B | Accuracy | 80.33 | 2 | 4d ago |
| ICD-Bench Difficulty Level 2 1.0 | OURS-14B | Accuracy | 85.63 | 2 | 4d ago |
| ICD-Bench Difficulty Level 1 1.0 | QWQ-MED-3 32B | Accuracy | 0.9675 | 2 | 4d ago |
| MedQuAD complete benchmark | Meddollina | Failure Rate | 0 | 1 | 4d ago |
Showing 23 of 23 rows