Uncertainty Estimation on QASPER
[Chart: AUROC by date. Best result: Total, 0.722 AUROC, Apr 18, 2026.]
Evaluation Results
Method                  Backbone     Date      AUROC
Total                   Mistral-7B   2026.04   0.722
Aleatoric               Mistral-7B   2026.04   0.623
SelfCheckGPT            Mistral-7B   2026.04   0.619
SC + VC                 Mistral-7B   2026.04   0.581
Closeness Centrality    Mistral-7B   2026.04   0.579
Kernel Lang. Ent.       Mistral-7B   2026.04   0.573
SC Based VC             Mistral-7B   2026.04   0.568
SC Score                Mistral-7B   2026.04   0.567
SemanticEntropy         Mistral-7B   2026.04   0.514
Self Certainty          Mistral-7B   2026.04   0.484
Mean Token Entropy      Mistral-7B   2026.04   0.481
Token Entropy           Mistral-7B   2026.04   0.481
Max Sequence Prob.      Mistral-7B   2026.04   0.48
Max Token Prob.         Mistral-7B   2026.04   0.468
Perplexity              Mistral-7B   2026.04   0.464
PTrue                   Mistral-7B   2026.04   0.423