Open-ended Generation on TruthfulQA single info 1.0 (test)
[Chart: Truthfulness Score and Truthfulness * Informativeness Score by date. Top entry: TACS-T, Truthfulness Score 66.6, Mar 12, 2024.]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Truthfulness Score | Truthfulness * Informativeness Score |
|---|---|---|---|---|
| TACS-T | Backbone=Mistral-Instr... | 2024.03 | 66.6 | 58 |
| TACS-S | Backbone=Mistral-Instr... | 2024.03 | 61.8 | 55.2 |
| Mistral-Instruct-v0.2 | Backbone=Mistral-Instr... | 2024.03 | 59.9 | 52.7 |
| TACS-S | Backbone=Llama 2-Chat,... | 2024.03 | 59.4 | 55.4 |
| TACS-T | Backbone=Llama 2-Chat,... | 2024.03 | 56.9 | 53.2 |
| Llama 2-Chat | Backbone=Llama 2-Chat,... | 2024.03 | 55.1 | 51.6 |
| ITI | Backbone=Llama 2-Chat,... | 2024.03 | 53.2 | 49.9 |