Step-level Hallucination Detection on TruthfulQA
Top result: 0.965 AUROC (Student), May 13, 2026
[Chart: AUROC]
Updated 20d ago
Evaluation Results

Method         Attributes                  Date     AUROC
Student        Deployability=deployable    2026.05  0.965
Teacher        Deployability=non-depl...   2026.05  0.96
Linear Probe   -                           2026.05  0.9
LLM-Check      Mechanism=attention         2026.05  0.698
TL-Perplexity  -                           2026.05  0.671
TL-Entropy     -                           2026.05  0.644