LLM Judgement Confidence Estimation on HH-RLHF (test)

0.4763 RK

Predictive Probability

[Chart: y-axis ticks 0.2735, 0.32615, 0.3788, 0.43145; x-axis May 14, 2026]
Updated 16d ago

Evaluation Results

Date       Value 1    Value 2
2026.05    0.4763     0.5259
2026.05    0.4738     0.523
2026.05    0.4478     0.551
2026.05    0.4409     0.554
2026.05    0.3986     0.5999
2026.05    0.3841     0.6154
2026.05    0.3803     0.6202
2026.05    0.376      0.6369
2026.05    0.3751     0.6374
2026.05    0.3678     0.6473
2026.05    0.3597     0.6489
2026.05    0.3571     0.6537
2026.05    0.3482     0.6578
2026.05    0.3286     0.6805
2026.05    0.3094     0.6945
2026.05    0.2813     0.7126