Uncertainty Quantification on MedQA (test)
[Chart: AUROC over time. Best result: SAR, 0.635 AUROC (Jul 3, 2023).]
Updated 3d ago
Evaluation Results
Method  Backbone          Date     AUROC
SAR     WizardLM-13b      2023.07  0.635
SE      WizardLM-13b      2023.07  0.62
SAR     LLaMA-2-13b-chat  2023.07  0.616
SE      LLaMA-2-13b-chat  2023.07  0.609
LN-PE   WizardLM-13b      2023.07  0.609
SE      Vicuna-13b        2023.07  0.599
SAR     Vicuna-13b        2023.07  0.598
LN-PE   Vicuna-13b        2023.07  0.572
LN-PE   LLaMA-2-13b-chat  2023.07  0.562