Hallucination Detection on MMEvalPro perception
[Chart: F1 (Faithful) and F1 (Unfaithful) by date. Top F1 (Faithful): 98.6 (Auxiliary Model), Dec 13, 2025.]
Updated 1mo ago
Evaluation Results
Method           Configuration              Date     F1 (Faithful)  F1 (Unfaithful)
Auxiliary Model  Evaluated Model=ThinkL...  2025.12  98.6           97.8
HaloScope        Evaluated Model=ThinkL...  2025.12  91.5           14.9
Prompting        Evaluated Model=ThinkL...  2025.12  84.8           30.8
SAPLMA           Evaluated Model=ThinkL...  2025.12  74.9           25.4
kNN              Evaluated Model=ThinkL...  2025.12  67.5           8.9