Truthfulness on TruthfulQA (avg.@8)
[Chart: Truthfulness Avg.@8; highlighted result: PSFT, 68.69 (Aug 25, 2025)]
Updated 4d ago
Evaluation Results
| Method | Configuration | Links (date) | Truthfulness Avg.@8 |
|---|---|---|---|
| PSFT | Backbone=Llama3.1-8B-I... | 2025.08 | 68.69 |
| PSFT (warm-up) | Backbone=Llama3.1-8B-I... | 2025.08 | 68.55 |
| PSFT | Backbone=Qwen2.5-7B-In... | 2025.08 | 67.16 |
| SFT | Backbone=Llama3.1-8B-I... | 2025.08 | 67.08 |
| SFT-KL | Backbone=Llama3.1-8B-I... | 2025.08 | 67.03 |
| PSFT (warm-up) | Backbone=Qwen2.5-7B-In... | 2025.08 | 66.37 |
| Base | Backbone=Qwen2.5-7B-In... | 2025.08 | 66.1 |
| SFT | Backbone=Qwen2.5-7B-In... | 2025.08 | 63.14 |
| SFT-KL | Backbone=Qwen2.5-7B-In... | 2025.08 | 61.31 |
| Base | Backbone=Llama3.1-8B-I... | 2025.08 | 55.14 |