Step-level hallucination detection on ProcessBench
[Chart: AUROC by date. Top result: Teacher, 91 AUROC. Date shown: May 13, 2026.]
Updated 20d ago
Evaluation Results
Method         Attributes                 Date     AUROC
Teacher        Deployability=non-depl...  2026.05  91
Student        Deployability=deployable   2026.05  75
Linear Probe   -                          2026.05  67.8
LLM-Check      Mechanism=attention        2026.05  61.9
TL-Entropy     -                          2026.05  57.1
TL-Perplexity  -                          2026.05  51.2