
Uncertainty Estimation on TriviaQA, SVAMP, and NQ Average

0.843 AUROC

G-NLL

Dec 19, 2024
Updated 1mo ago

Evaluation Results

Date      AUROC
2024.12   0.843
2024.12   0.838
2024.12   0.838
2024.12   0.824
2024.12   0.82
2024.12   0.804
2024.12   0.795
2024.12   0.793
2024.12   0.792
2024.12   0.776
2024.12   0.775
2024.12   0.728
2024.12   0.726
2024.12   0.723
2024.12   0.722
2024.12   0.719
2024.12   0.699
2024.12   0.649
2024.12   0.649
2024.12   0.615
2024.12   0.615
2024.12   0.612
2024.12   0.562