Confidence Estimation on Open-Set

67.2 Accuracy

GPT-4.1 + Verbalized Conf.

Nov 18, 2025
Updated 17d ago

Evaluation Results

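| Method | Date | Accuracy | | | |
|---|---|---|---|---|---|
| GPT-4.1 + Verbalized Conf. | 2025.11 | 67.2 | 67.9 | 28.8 | 27.7 |
| | 2025.11 | 66.9 | 71.8 | 21 | 10 |
| | 2025.11 | 66.5 | - | - | - |
| | 2025.11 | 64.8 | 70.5 | 24 | 18.6 |
| | 2025.11 | 58.3 | 71.2 | 30.1 | 28.5 |
| | 2025.11 | 57.6 | - | - | - |
| | | 57.3 | 73.1 | 21.5 | 9.4 |
| | 2025.11 | 55.8 | 67.1 | 28.4 | 24.2 |
| | 2025.11 | 44.4 | 63.4 | 46.8 | 47.7 |
| | 2025.11 | 43.5 | 62.6 | 51.2 | 51.9 |
| | | 43.4 | 68.1 | 30.4 | 27.4 |
| | 2025.11 | 42.1 | - | - | - |
| | | 32.4 | 68.1 | 33.3 | 33.1 |
| | 2025.11 | 31 | 64 | 53.9 | 57 |
| | 2025.11 | 30.3 | 64.5 | 60.9 | 63 |
| | 2025.11 | 30.1 | - | - | - |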