SOTA Uncertainty Quantification benchmarks and papers with code | Wizwand
Uncertainty Quantification
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| Average of 6 datasets | Dissimilarity + beamsearch | PRR | 65 | 120 | 1mo ago |
| Musique 500 randomly sampled queries (test) | R2C | AUROC | 0.8322 | 70 | 1mo ago |
| HotpotQA 500 randomly sampled queries (test) | R2C | AUROC | 83.25 | 70 | 1mo ago |
| PopQA 500 randomly sampled queries (test) | R2C | AUROC | 0.8709 | 70 | 1mo ago |
| ImageNet 10 | lowrank-KFAC | NLL | 0.266 | 42 | 5d ago |
| CIFAR10 | lowrank-KFAC | NLL | 0.256 | 42 | 5d ago |
| FashionMNIST | lowrank-KFAC | NLL | 0.248 | 42 | 5d ago |
| Vision Datasets averaged (test) | SGPU | AUROC | 81.7 | 36 | 1mo ago |
| LongFact | Ecc | PCC | -0.017 | 32 | 9d ago |
| BIO | Ecc | PCC | -0.129 | 32 | 9d ago |
| MulFactTrap (test) | RUfact | ROC AUC | 0.898 | 32 | 1mo ago |
| Mixed Dataset (real and fake biographies) | RUgen | ROC AUC | 0.9001 | 32 | 1mo ago |
| MAQA ∆K−1 | Structure-Aware Minimum Bayes Risk Decoding | KL Divergence AUC | 0.757 | 28 | 1mo ago |
| CNN/DailyMail | Structure-Aware Minimum Bayes Risk Decoding | Hamming AUC | 0.745 | 28 | 1mo ago |
| WMT 19 | KLE | COMET AUC | 0.608 | 28 | 1mo ago |
| MAQA | Structure-Aware Minimum Bayes Risk Decoding | Hamming AUC | 83.5 | 28 | 1mo ago |
| SciQ (test) | SENTSAR | AUROC | 74.5 | 28 | 1mo ago |
| MSD Task01 (test) | ACQR | Coverage (%) | 94.22 | 24 | 15d ago |
| ARC-E | ScalaBL | Training Memory (MB) | 17,290 | 14 | 11d ago |
| CIFAR-10 (test) | MAP | Accuracy | 93.5 | 14 | 1mo ago |
| PTB | LSTM | CU | 417 | 12 | 1mo ago |
| MIT-BIH | LSTM | CU | 998 | 12 | 1mo ago |
| MedQA (test) | SAR | AUROC | 0.635 | 9 | 1mo ago |
| Countdown, GSM8K, MATH500, SVAMP Combined | DiSE | Average ROC-AUC | 63.7 | 8 | 1mo ago |
| MATH500 | LLaMA perplexity | ROC-AUC (Threshold 128) | 65.2 | 8 | 1mo ago |
Showing 25 of 66 rows