SOTA Uncertainty Estimation benchmarks and papers with code | Wizwand
Uncertainty Estimation
Benchmarks
| Dataset Name | SOTA Method | Metric | Score | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| TriviaQA | Geometry-Calibrated Conformal Abstention | AUROC | 88 | 111 | 1mo ago |
| TriviaQA (test) | ACT-ViT | AUROC | 87.91 | 110 | 2d ago |
| JudgeBench (test) | Energy | AUROC | 71.53 | 77 | 3mo ago |
| CoQA | Logit Magnitude | AUROC | 0.857 | 58 | 26d ago |
| SciQA | Eigenscore | AUROC | 0.8269 | 56 | 2mo ago |
| emrQA | Logit Magnitude | AUROC | 77.5 | 42 | 26d ago |
| NewsQA | Self-Consistency | AUROC | 76.6 | 42 | 26d ago |
| CoQA (test) | SAR | AUROC | 77.3 | 42 | 3mo ago |
| GSM8K | BSDETECTOR | AUROC | 0.951 | 41 | 1mo ago |
| TruthfulQA | Closeness Centrality | AUROC | 63.9 | 40 | 1mo ago |
| NaturalQA | SAR | AUROC | 77 | 39 | 1mo ago |
| Real-Noise | Deep Ens. | Uncertainty Interval Length | 0.021 | 36 | 8d ago |
| Secret-word taboo dataset (random-split) | Bootstrap | Accuracy | 42.4 | 32 | 7d ago |
| NQ-open | SE+MCH | AUROC | 75.88 | 32 | 1mo ago |
| Gaussian noise dataset | Deep Ens. | Uncertainty Interval Length | 0.05 | 30 | 8d ago |
| Poisson noise dataset | QUTCC | Uncertainty Interval Length | 0.035 | 30 | 8d ago |
| WebQA | SE + MARS | AUROC | 73.57 | 30 | 3mo ago |
| OBQA | DeepEns | AUROC | 88.03 | 24 | 1mo ago |
| RACE Llama-3.1-8B and Gemma-2-9B backbones (test) | ETN | AUROC | 91.3 | 24 | 1mo ago |
| SimpleQA, MuSiQue, and TruthfulQA Average | Internal Confidence | AUROC | 61 | 24 | 3mo ago |
| MuSiQue | Predictive Entropy | AUROC | 65.6 | 24 | 3mo ago |
| SimpleQA | P(YES) (top right) | AUROC | 61.3 | 24 | 3mo ago |
| TriviaQA, SVAMP, and NQ Average | G-NLL | AUROC | 0.843 | 23 | 1mo ago |
| CommonsenseQA | Geometry-Calibrated Conformal Abstention | AUROC | 0.74 | 18 | 1mo ago |
| SciQ | Geometry-Calibrated Conformal Abstention | AUROC | 82 | 18 | 1mo ago |
Showing 25 of 83 rows