Error Prediction on TriviaQA (val)

81.3 PRR

latent selective

Jun 1, 2026
Updated 1d ago

Evaluation Results

Date     Score
2026.06  81.3
2026.06  79.5
2026.06  79.3
2026.06  78.8
2026.06  78.7
2026.06  78.6
2026.06  78.4
2026.06  77.8
2026.06  77.8
2026.06  77.3
2026.06  77.3
2026.06  77.1
2026.06  76.1
2026.06  75.9
2026.06  75.8
2026.06  75.7
2026.06  75.4
2026.06  75.3
2026.06  75.2
2026.06  74.8
2026.06  74.2
2026.06  73.9
2026.06  73.8
2026.06  73.8
2026.06  73.5
2026.06  72.9
2026.06  72.5
2026.06  67.3
2026.06  67.1
2026.06  64.1
2026.06  56.9
2026.06  56.8
2026.06  20.4
2026.06  19.7
2026.06  19.6

-        0.858
2026.06  0.854
-        0.85
2026.06  0.849
2026.06  0.848
-        0.836
2026.06  0.831
-        0.813
2026.06  0.787
-        0.773
-        0.763
2026.06  0.747
2026.06  0.742
-        0.482
2026.06  0.48
2026.06  0.214