Factual Hallucination on TruthfulQA
[Chart: MC1 Score / MC2 Score. Highlighted point: DeLask, MC1 Score 41.74, May 30, 2026.]
Evaluation Results
| Method  | Settings                  | Date    | MC1 Score | MC2 Score |
|---------|---------------------------|---------|-----------|-----------|
| DeLask  | Model=LLaMA3-8B, Decod... | 2026.05 | 41.74     | 58.76     |
| Regular | Model=LLaMA3-8B, Decod... | 2026.05 | 39.72     | 57.1      |
| UAD     | Model=LLaMA3-8B, Decod... | 2026.05 | 39.59     | 57.44     |
| DOLA    | Model=LLaMA3-8B, Decod... | 2026.05 | 38.43     | 56.38     |
| ITI     | Model=LLaMA3-8B, Decod... | 2026.05 | 37.14     | 57.05     |
| AD      | Model=LLaMA3-8B, Decod... | 2026.05 | 35.45     | 55.74     |
| DeLask  | Model=LLaMA2-7B, Decod... | 2026.05 | 35.02     | 52.27     |
| AD      | Model=LLaMA2-7B, Decod... | 2026.05 | 34.14     | 52.03     |
| DOLA    | Model=LLaMA2-7B, Decod... | 2026.05 | 33.71     | 53.17     |
| Regular | Model=LLaMA2-7B, Decod... | 2026.05 | 33.51     | 51.12     |
| ITI     | Model=LLaMA2-7B, Decod... | 2026.05 | 33.39     | 52.03     |
| UAD     | Model=LLaMA2-7B, Decod... | 2026.05 | 32.01     | 52.47     |