Hallucination Detection on PHOENIX 2014T
[Chart: AUC by date; selectable metrics: AUC, AP, Accuracy. Best AUC: 94 (META), Oct 21, 2025.]
Evaluation Results
| Method | Model Type | Links | AUC | AP | Accuracy |
|---|---|---|---|---|---|
| META | Gloss-free... | 2025.10 | 94 | 99.2 | 87.2 |
| Confidence | Gloss-free... | 2025.10 | 92.6 | 98.9 | 86 |
| Grounding | Gloss-free... | 2025.10 | 91.8 | 98.6 | 93.8 |
| META | Gloss-based... | 2025.10 | 89.9 | 97.8 | 86.4 |
| Perplexity | Gloss-free... | 2025.10 | 88.8 | 97.9 | 71.3 |
| Token entropy | Gloss-free... | 2025.10 | 87.1 | 97.5 | 73.1 |
| Grounding | Gloss-based... | 2025.10 | 82.7 | 95.4 | 89.9 |
| Token entropy | Gloss-based... | 2025.10 | 63.9 | 90 | 58.1 |
| Confidence | Gloss-based... | 2025.10 | 61.8 | 88.7 | 64.5 |
| Perplexity | Gloss-based... | 2025.10 | 59.5 | 89.2 | 56.1 |