
Faithfulness Detection on LatentAudit Llama-3-8B (evaluation set)

0.948 AUROC

GPT-4o Judge

[Chart: score axis from 0.71296 to 0.89602]
Apr 7, 2026
Updated 1mo ago

Evaluation Results

Method    Links
2026.04   0.948   0.881
2026.04   0.942   0.869
2026.04   0.908   0.841
2026.04   0.882   0.815
2026.04   0.871   0.804
2026.04   0.722   0.655