Reasoning on Natural

70.96 Accuracy

HALLUGUARD

Updated 1mo ago

Evaluation Results

Method	Date	Accuracy
HALLUGUARD	2026.01	70.96
Energy	2026.01	68.59
MIND	2026.01	68.32
P(true)	2026.01	68.16
Semantic Ent.	2026.01	68.1
LN Entropy	2026.01	68.04
FActScore	2026.01	67.74
Perplexity	2026.01	67.51
Inside	2026.01	67.42
RACE	2026.01	66.9
SelfCheck GPT	2026.01	65.68
IO Prompt	2026.01	55.24