Generative Multiple-choice Question Answering on TruthfulQA
[Chart: TA Rate, UR Rate, and DA Rate over time. Best TA Rate: 76.3 (Llama 2-Chat, Mar 12, 2024).]
Updated 4d ago
Evaluation Results
| Method | Details | Date | TA Rate | UR Rate | DA Rate |
|---|---|---|---|---|---|
| Llama 2-Chat | Base Model=Llama 2-Cha... | 2024.03 | 76.3 | 13.7 | 45 |
| Mistral-Instruct-v0.2 | Base Model=Mistral-Ins... | 2024.03 | 75.4 | 22.6 | 49 |
| Mistral-Instruct-v0.2 + TACS-S | Base Model=Mistral-Ins... | 2024.03 | 46.3 | 91.4 | 68.9 |
| Mistral-Instruct-v0.2 + TACS-T | Base Model=Mistral-Ins... | 2024.03 | 44.9 | 89.6 | 67.2 |
| Llama 2-Chat + TACS-S | Base Model=Llama 2-Cha... | 2024.03 | 43.7 | 74.9 | 64.3 |
| Llama 2-Chat + TACS-T | Base Model=Llama 2-Cha... | 2024.03 | 43.4 | 85.8 | 64.7 |