Factuality on TruthfulQA gen
[Chart: BLEU Acc over time. Highlighted result: FGD, 42 BLEU Acc (Mar 15, 2026).]
Updated 1mo ago
Evaluation Results
Method            Setting                      Date      BLEU Acc
FGD               Base Model=Llama 3.2 1...    2026.03   42
Baseline          Base Model=Llama 3.2 1...    2026.03   36
Vocab. Trimming   Base Model=Llama 3.2 1...    2026.03   36
SVDSoftmax        Base Model=Llama 3.2 1...    2026.03   36
FlashHead         Base Model=Llama 3.2 1...    2026.03   36