Hallucination Prediction on HotpotQA + type
Top result: 75.51 AUROC, Conf + Probe (SCAO), Sep 18, 2025
Metrics: AUROC, A(ϕ(s_M))
Evaluation Results
| Method | Model | Date | AUROC | A(ϕ(s_M)) |
|---|---|---|---|---|
| Conf + Probe (SCAO) | LLaMA-8B | 2025.09 | 75.51 | 20.14 |
| Conf + Probe | LLaMA-8B | 2025.09 | 73.87 | 18.5 |
| Conf (SCAO) | LLaMA-8B | 2025.09 | 73.81 | - |
| Conf + Probe (SCAO) | LLaMA-70B | 2025.09 | 73.52 | 18.96 |
| Conf (SCAO) | LLaMA-70B | 2025.09 | 73.42 | - |
| Probednn | LLaMA-8B | 2025.09 | 73.17 | 17.8 |
| Conf + Probe | LLaMA-70B | 2025.09 | 73.06 | 18.5 |
| Conf | LLaMA-70B | 2025.09 | 72.87 | - |
| Probednn | LLaMA-70B | 2025.09 | 69.42 | 14.86 |
| Conf | LLaMA-8B | 2025.09 | 68.82 | - |
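
For readers unfamiliar with the headline metric, the following is a minimal sketch of how an AUROC value like those listed here could be computed for a hallucination predictor. It is not the benchmark's evaluation code: the function name `auroc`, the toy labels and scores, and the convention that label 1 means "hallucinated" are all assumptions made for illustration. The page reports AUROC on a 0-100 scale, so the sketch multiplies by 100 when printing.

```python
def auroc(labels, scores):
    """Rank-based AUROC: the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one.
    Ties between a positive and a negative count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = hallucinated answer, 0 = correct answer.
labels = [1, 0, 1, 0, 0, 1]
# Hypothetical predictor outputs (higher = more likely hallucinated).
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]

value = auroc(labels, scores)
print(f"AUROC = {100 * value:.2f}")  # prints "AUROC = 88.89"
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration; for real evaluation sets a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.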