Multiple-choice tasks on FACTOR
[Chart: Accuracy (news) leaderboard. Leading entry: TruthX, 65.83 (Feb 27, 2024). Available metrics: Accuracy (news), Accuracy (expert), Accuracy (wiki).]
Updated 3d ago
Evaluation Results
Method            Backbone          Date      Accuracy (news)   Accuracy (expert)   Accuracy (wiki)
TruthX            Llama-2-7B-Chat   2024.02   65.83             65.25               57.18
Llama-2-7B-Chat   Llama-2-7B-Chat   2024.02   64.67             64.83               56.95
ITI               Llama-2-7B-Chat   2024.02   53.28             51.69               43.82