Language Reasoning on TruthfulQA
Top result: 40.15 Accuracy (FP16)
[Chart: Accuracy over time; data point dated Mar 18, 2026]
Evaluation Results
| Method | Model | Date | Accuracy |
|---|---|---|---|
| FP16 | Qwen2.5-14B | 2026.03 | 40.15 |
| NSDS | Qwen2.5-14B | 2026.03 | 31.58 |
| Kurtosis | Qwen2.5-14B | 2026.03 | 30.77 |
| ZD | Qwen2.5-14B | 2026.03 | 30.45 |
| MSE | Qwen2.5-14B | 2026.03 | 29.63 |
| EWQ | Qwen2.5-14B | 2026.03 | 28.44 |
| FP16 | Llama-2-13B | 2026.03 | 25.95 |
| NSDS | Llama-2-13B | 2026.03 | 23.86 |
| KurtBoost | Llama-2-13B | 2026.03 | 23.04 |
| EWQ | Llama-2-13B | 2026.03 | 22.85 |
| ZD | Llama-2-13B | 2026.03 | 22.23 |
| MSE | Llama-2-13B | 2026.03 | 21.96 |