Multiple-choice tasks on FACTOR
[Chart: Accuracy (news). Highlighted point: TruthX, 65.83, Feb 27, 2024. Metrics listed: Accuracy (news), Accuracy (expert), Accuracy (wiki).]
Updated 1mo ago
Evaluation Results
Method           Backbone         Date     Accuracy (news)  Accuracy (expert)  Accuracy (wiki)
TruthX           Llama-2-7B-Chat  2024.02  65.83            65.25              57.18
Llama-2-7B-Chat  Llama-2-7B-Chat  2024.02  64.67            64.83              56.95
ITI              Llama-2-7B-Chat  2024.02  53.28            51.69              43.82