Clinical Diagnosis on MedXpertQA
[Chart: C-MIG, 7.52 EM, May 27, 2026. Metrics: EM, KG Score, Average Score.]
Updated 6d ago
Evaluation Results
Method                 Backbone    Date     EM    KG Score  Average Score
C-MIG                  Qwen2.5-7B  2026.05  7.52  13.39     10.46
AutoRefine-HardDoc     Qwen2.5-7B  2026.05  7.13  12.95     10.04
AutoRefine-ICDTree     Qwen2.5-7B  2026.05  5.15   8.91      7.03
AutoRefine-HardSearch  Qwen2.5-7B  2026.05  5.15  11.17      8.16
AutoRefine             Qwen2.5-7B  2026.05  4.55   7.41      5.98
Search-R1              Qwen2.5-7B  2026.05  3.76   5.98      4.87
IGPO                   Qwen2.5-7B  2026.05  3.76   7.22      5.74
AutoRefine-Embedding   Qwen2.5-7B  2026.05  3.76   6.77      5.26
Qwen2.5-7B             Qwen2.5-7B  2026.05  0.59   2.65      1.68