Truthfulness Evaluation on TruthfulQA generation
[Chart: Exclusive Catch Rate (@10%) and Exclusive Catch Rate (@20%). Highlighted point: QWEN 2.5 7B-I, 7.9 at @10%, Apr 27, 2026. Updated 1mo ago.]
Evaluation Results
| Method | Method details | Date | Exclusive Catch Rate (@10%) | Exclusive Catch Rate (@20%) |
|---|---|---|---|---|
| QWEN 2.5 7B-I | Probe training=WikiTex... | 2026.04 | 7.9 | 12.7 |
| Phi-3 Mini-I | Probe training=WikiTex... | 2026.04 | 7.7 | 13.1 |
| Mistral 7B-I | Probe training=WikiTex... | 2026.04 | 6.7 | 10.9 |