Confidence Estimation on SummEval
Top result: VERDI LR ensemble, AUROC 0.717
[Chart: AUROC by date; data point dated May 11, 2026]
Updated 21d ago
Evaluation Results
Method               Calls  Date     AUROC
VERDI LR ensemble    1×     2026.05  0.717
Best single signal   1×     2026.05  0.675
CoT-UQ               5×     2026.05  0.652
Trace length         1×     2026.05  0.561
Logprob confidence   1×     2026.05  0.371