
Selective Prediction on TriviaQA 200 samples (test)

Rejection Accuracy (80%): 62.5

Semantic Entropy

[Chart: y-axis ticks 33.068, 40.709, 48.35, 55.991; x-axis label Mar 30, 2026]
Updated 18d ago

Evaluation Results

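| Method | Date | Rejection Accuracy (80%) | | | | | | Links |
|---|---|---|---|---|---|---|---|---|
| | 2026.03 | 62.5 | 58.3 | 57.9 | 57.5 | 0.176 | 0.67 | |
| | 2026.03 | 62.5 | 58.3 | 58.9 | 57.5 | 0.177 | 0.661 | |
| | 2026.03 | 59.3 | 58.3 | 57.8 | 57.5 | 0.174 | 0.666 | |
| | 2026.03 | 59.3 | 58.3 | 57.8 | 57.5 | 0.174 | 0.683 | |
| | 2026.03 | 57.5 | 57.5 | 57.5 | 57.5 | 0.172 | 0.586 | |
| | 2026.03 | 56.2 | 58.3 | 60.5 | 57.5 | 0.175 | 0.475 | |
| | 2026.03 | 39.4 | 41.6 | 46.8 | 37.5 | 0.123 | 0.687 | |
| | 2026.03 | 39.4 | 41.6 | 43.7 | 37.5 | 0.121 | 0.673 | |
| | 2026.03 | 39.4 | 41.6 | 43.7 | 37.5 | 0.121 | 0.772 | |
| | 2026.03 | 37.5 | 40.5 | 40.5 | 37.5 | 0.117 | 0.54 | |
| | 2026.03 | 36.8 | 38.8 | 43.7 | 37.5 | 0.116 | 0.716 | |
| | 2026.03 | 34.2 | 33.3 | 34.3 | 37.5 | 0.103 | 0.416 | |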