Faithfulness Detection on LatentAudit Qwen-2.5-7B (evaluation)

94.5 AUROC

GPT-4o Judge

[Chart: y-axis ticks 70.892, 77.021, 83.15, 89.279; x-axis label Apr 7, 2026]
Updated 1mo ago

Evaluation Results

Method    Links
2026.04   94.5   87.6
2026.04   93.8   86.2
2026.04   90.1   83.2
2026.04   87.6   80.8
2026.04   86.5   79.8
2026.04   71.8   65