Faithfulness Detection on LatentAudit Mistral-7B (evaluation)
Top result: GPT-4o Judge, 94 AUROC
[Chart: AUROC and F1 over time; plotted date: Apr 7, 2026]
Updated 1mo ago
Evaluation Results
| Method | Latency (ms) | Date | AUROC | F1 |
|---|---|---|---|---|
| GPT-4o Judge | ∼5,300 | 2026.04 | 94 | 87 |
| LatentAudit | 0.77 (+11... | 2026.04 | 92.5 | 85.2 |
| INSIDE | ∼3.8 | 2026.04 | 89.5 | 82.5 |
| SAPLMA | ∼1.5 | 2026.04 | 87 | 80 |
| SelfCheckGPT | ∼28,500 | 2026.04 | 85.8 | 79 |
| Min-Perplexity | 0.0 | 2026.04 | 71 | 64.2 |