Medical Question Answering on CV-MedQA ambiguous (test)
[Chart: Accuracy and Improvement (%) by date; top result AU-Probe, Accuracy 0.7643; date shown: Jan 24, 2026]
Evaluation Results

| Method | Configuration | Date | Accuracy | Improvement (%) |
| --- | --- | --- | --- | --- |
| AU-Probe | Model=Bio-Medical-Llam... | 2026.01 | 0.7643 | 15.63 |
| ASK4CONF | Model=Bio-Medical-Llam... | 2026.01 | 0.6866 | - |
| No Clarification | Model=Bio-Medical-Llam... | 2026.01 | 0.608 | - |
| AU-Probe | Model=Qwen2.5-7B, Clar... | 2026.01 | 0.5939 | 10.84 |
| AU-Probe | Model=Llama-3.1-8B, Cl... | 2026.01 | 0.5719 | 9.03 |
| ASK4CONF | Model=Llama-3.1-8B, Cl... | 2026.01 | 0.5232 | - |
| ASK4CONF | Model=Qwen2.5-7B, Clar... | 2026.01 | 0.5169 | - |
| No Clarification | Model=Qwen2.5-7B, Clar... | 2026.01 | 0.4855 | - |
| No Clarification | Model=Llama-3.1-8B, Cl... | 2026.01 | 0.4815 | - |
| AU-Probe | Model=BioMistral-7B, C... | 2026.01 | 0.4116 | 4.79 |
| ASK4CONF | Model=BioMistral-7B, C... | 2026.01 | 0.3841 | - |
| No Clarification | Model=BioMistral-7B, C... | 2026.01 | 0.3637 | - |
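The page does not say how Improvement (%) is computed. The listed values look consistent with the AU-Probe accuracy minus the No Clarification accuracy for the same model, expressed in percentage points. That reading is an inference from the numbers above, not something the source states. The sketch below checks it; the function name `improvement_points` and the dictionary layout are hypothetical, and the truncated model label is kept exactly as it appears.

```python
# Accuracies and listed improvements copied from the results above.
# Assumption (not stated by the source): Improvement (%) is the absolute gain
# of AU-Probe over the No Clarification baseline, in percentage points.
RESULTS = {
    "Bio-Medical-Llam...": {"au_probe": 0.7643, "baseline": 0.608, "listed": 15.63},
    "Qwen2.5-7B": {"au_probe": 0.5939, "baseline": 0.4855, "listed": 10.84},
    "Llama-3.1-8B": {"au_probe": 0.5719, "baseline": 0.4815, "listed": 9.03},
    "BioMistral-7B": {"au_probe": 0.4116, "baseline": 0.3637, "listed": 4.79},
}


def improvement_points(method_acc: float, baseline_acc: float) -> float:
    """Absolute accuracy gain over the baseline, in percentage points."""
    return round((method_acc - baseline_acc) * 100, 2)


def compare_with_listed(results: dict) -> dict:
    """Return, per model, the recomputed gain and its gap to the listed value."""
    report = {}
    for model, row in results.items():
        computed = improvement_points(row["au_probe"], row["baseline"])
        report[model] = {
            "computed": computed,
            "listed": row["listed"],
            "gap": round(abs(computed - row["listed"]), 2),
        }
    return report


if __name__ == "__main__":
    for model, row in compare_with_listed(RESULTS).items():
        print(model, row)
```

Under this assumption, three of the four recomputed values match the listed ones exactly, and the Llama-3.1-8B value differs by 0.01, which may come from the displayed accuracies being rounded to four decimals.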