Factuality evaluation on TruthfulQA multiple-choice
[Chart: TruthfulQA Delta (Δ). Top entry: WiSE-FT, 19.9 (May 19, 2026). Updated 14d ago.]
Evaluation Results
Method    Model      Date      TruthfulQA Delta (Δ)
WiSE-FT   Qwen3-4B   2026.05   19.9
LoRA      Qwen3-4B   2026.05   16.4
SFT       Qwen3-4B   2026.05   13.9
L2 Reg    Qwen3-4B   2026.05   12
FLOW      Qwen3-4B   2026.05   11.7
TALR      Qwen3-4B   2026.05   5.4
DFT       Qwen3-4B   2026.05   4.9
FINCH     Qwen3-4B   2026.05   2.6
STM       Qwen3-4B   2026.05   0.7
Base      Qwen3-4B   2026.05   0