
Error Detection on TruthfulQA (full)

0.599 AUROC

B1 mean entropy

Mar 25, 2026
Updated 23d ago

Evaluation Results

Method  Links
2026.03  0.599  0.559  -
2026.03  0.588  0.547  6
2026.03  0.548  0.511  1
2026.03  0.542  0.513  11