Confidence Estimation on MediTOD
[Chart: results over time. Top result: 68.7 AUROC (MedConf), Jan 22, 2026. Metrics shown: AUROC, Pearson Correlation.]
Evaluation Results
| Method | Settings (truncated) | Date | AUROC | Pearson Correlation |
|---|---|---|---|---|
| MedConf | Type=Self-Verbalized,... | 2026.01 | 68.7 | 0.943 |
| MedConf | Model=GPT-4.1, Type=Se... | 2026.01 | 67.2 | 0.981 |
| CE | Type=Self-Verbalized,... | 2026.01 | 63.6 | 0.895 |
| SemSim (BERT) | Type=Consistency-Level... | 2026.01 | 62.8 | 0.491 |
| CE | Model=GPT-4.1, Type=Se... | 2026.01 | 62.8 | 0.981 |
| SemSim (BERT) | Model=GPT-4.1, Type=Co... | 2026.01 | 59.3 | 0.829 |
| Conditional PMI | Type=Token-Level, Mode... | 2026.01 | 56 | 0.734 |
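
For readers unfamiliar with the two reported metrics, the following is a minimal Python sketch of how AUROC and a Pearson correlation can be computed from per-example confidence scores. The page does not state how the benchmark defines either metric, so everything here is an assumption for illustration: the function names `auroc` and `pearson`, the toy data, the use of binary correctness labels, and the choice to correlate confidence directly with correctness.

```python
from math import sqrt


def auroc(confidences, correct):
    """Fraction of (correct, incorrect) pairs in which the correct item
    has the higher confidence; ties count as half a win."""
    pos = [c for c, y in zip(confidences, correct) if y]
    neg = [c for c, y in zip(confidences, correct) if not y]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect item")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data, not taken from MediTOD:
# model confidence per example and whether the output was correct.
toy_conf = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_correct = [1, 1, 0, 1, 0, 0]

# Scaled by 100 to match the percentage-style AUROC values in the table.
print(round(100 * auroc(toy_conf, toy_correct), 1))  # → 88.9
print(pearson(toy_conf, toy_correct))
```

The pairwise AUROC above is quadratic in the number of examples, which is fine for a small sanity check; a rank-based implementation from an established library would be the usual choice for a full evaluation set.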