Error Detection on TruthfulQA 200-question subset

0.512 AUROC

SE-NLI

Mar 25, 2026
Updated 23d ago

Evaluation Results

Method  Links
2026.03  0.512  0.421  11
2026.03  0.511  0.419  11
2026.03  0.501  0.404  11