Hallucination Detection on HalluQA
Best result: 94.44 Accuracy (ANAH-v2-Stage3)
[Chart: results over time, Jul 5, 2024 to Jan 31, 2026. Metrics: Accuracy, F1 Score.]
Updated 27d ago
Evaluation Results
| Method | Setting | Date | Accuracy | F1 Score |
| --- | --- | --- | --- | --- |
| ANAH-v2-Stage3 | Method=Zero-Shot | 2024.07 | 94.44 | - |
| ANAH-v2-Stage2 | Method=Zero-Shot | 2024.07 | 92.63 | - |
| ANAH-v2-Stage1 | Method=Zero-Shot | 2024.07 | 91.74 | - |
| GPT4 | Method=Zero-Shot | 2024.07 | 62.81 | - |
| HalluClean | Backbone=GPT-3.5-turbo | 2025.11 | 55 | 41.6 |
| GPT-3.5-turbo | Backbone=GPT-3.5-turbo | 2025.11 | 46.5 | 7 |
| GAME-LoRA | Protocol=Training-time | 2026.01 | 44.5 | - |
| CAD | Protocol=Inference-time | 2026.01 | 41.7 | - |
| ME | Protocol=Training-time | 2026.01 | 39.5 | - |
| Disagreement | Protocol=Training-time | 2026.01 | 39.1 | - |
| Baseline | Architecture=LoRA, Los... | 2026.01 | 37.6 | - |
| ActDec | Protocol=Inference-time | 2026.01 | 37.6 | - |
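The leaderboard reports Accuracy and F1 Score side by side, and the two can diverge sharply for the same system. The sketch below shows one common way both metrics are computed for a binary hallucination detector. It is an illustration only: the function name `accuracy_and_f1`, the label convention (1 = hallucinated, 0 = not hallucinated), and the toy labels are assumptions, not the benchmark's actual evaluation script or data.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) in percent for binary labels.

    Assumed convention (hypothetical): 1 = hallucinated, 0 = not hallucinated.
    F1 is computed for the positive class (label 1).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = 100.0 * correct / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all.
    f1 = 100.0 * 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels (made up for illustration): the detector rarely predicts
# "hallucinated", so it misses most positive cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.1f} f1={f1:.1f}")
```

On these toy labels a detector that almost never flags a hallucination still gets most examples right, while its F1 on the positive class is much lower. That is one plausible reading of rows where Accuracy is moderate but F1 Score is small.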