Probabilistic Multiple-Choice on TruthfulQA (single info)
[Chart: score over time, selectable by MC1 Score, MC2 Score, MC3 Score, or Average Score. Top MC1 Score: 59.2, Mistral-Instruct-v0.2 + TACS-T, Mar 12, 2024]
Updated 4d ago
Evaluation Results
Method                          Details                    Date     MC1 Score  MC2 Score  MC3 Score  Average Score
Mistral-Instruct-v0.2 + TACS-T  Backbone=Mistral-Instr...  2024.03  59.2       69         44.8       57.7
Mistral-Instruct-v0.2 + TACS-S  Backbone=Mistral-Instr...  2024.03  55.8       59.4       39.9       51.7
Mistral-Instruct-v0.2           Backbone=Mistral-Instr...  2024.03  53.6       56.4       37         49
Llama 2-Chat + TACS-S           Backbone=Llama 2-Chat,...  2024.03  50.8       57.8       33.7       47.5
Llama 2-Chat                    Backbone=Llama 2-Chat      2024.03  50.6       51.7       31.1       44.5
Llama 2-Chat + ITI              Backbone=Llama 2-Chat,...  2024.03  50.6       51.2       30.5       44.1
Llama 2-Chat + TACS-T           Backbone=Llama 2-Chat,...  2024.03  48.8       56.7       33.4       46.3
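
The page does not say how Average Score is derived. The reported values are consistent with the arithmetic mean of MC1, MC2, and MC3, so the sketch below recomputes that mean for each listed method and compares it with the reported figure. The mean formula, the 0.1 tolerance, and the names `ROWS`, `mean_of_mc`, and `check_rows` are assumptions made for illustration, not something the leaderboard documents.

```python
# Rows copied from the results listed above:
# (method, MC1, MC2, MC3, reported Average Score).
ROWS = [
    ("Mistral-Instruct-v0.2 + TACS-T", 59.2, 69.0, 44.8, 57.7),
    ("Mistral-Instruct-v0.2 + TACS-S", 55.8, 59.4, 39.9, 51.7),
    ("Mistral-Instruct-v0.2", 53.6, 56.4, 37.0, 49.0),
    ("Llama 2-Chat + TACS-S", 50.8, 57.8, 33.7, 47.5),
    ("Llama 2-Chat", 50.6, 51.7, 31.1, 44.5),
    ("Llama 2-Chat + ITI", 50.6, 51.2, 30.5, 44.1),
    ("Llama 2-Chat + TACS-T", 48.8, 56.7, 33.4, 46.3),
]


def mean_of_mc(mc1, mc2, mc3):
    """Assumed definition of Average Score: the plain mean of the three MC scores."""
    return (mc1 + mc2 + mc3) / 3


def check_rows(rows, tol=0.1):
    """Return (method, recomputed mean, reported average, within_tolerance) per row.

    The tolerance is an assumption: the displayed scores are already rounded,
    so a recomputed mean can differ from the reported one in the last digit.
    """
    results = []
    for method, mc1, mc2, mc3, reported in rows:
        recomputed = mean_of_mc(mc1, mc2, mc3)
        within = abs(recomputed - reported) <= tol
        results.append((method, round(recomputed, 2), reported, within))
    return results


if __name__ == "__main__":
    for method, recomputed, reported, within in check_rows(ROWS):
        flag = "ok" if within else "MISMATCH"
        print(f"{method:32s} recomputed={recomputed:6.2f} reported={reported:5.1f} {flag}")
```

Under this assumption every row agrees to within 0.1. The largest gap is Llama 2-Chat + TACS-S, where the recomputed mean is about 47.43 against a reported 47.5, which may simply reflect averaging over unrounded scores.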