Confidence Estimation on Internal Verification Suite GPT-4.1-mini (test)

Q1 Attribution Present (AUROC): 0.912

Logprob

Updated 2mo ago

Evaluation Results

Method	Links	Q1 Attribution Present (AUROC)
Logprob 2026.05		0.912	0.716	0.706	0.84	0.905	0.995	0.602
Verbalized 2026.05		0.851	0.632	-	0.61	-	-	0.735
VERDI CV 2026.05		0.806	0.825	0.915	0.877	0.905	1	0.459