Truthfulness Evaluation on TruthfulQA

16.9 Reliability Score

Aligner

-1.9242.9637.8512.737
Feb 4, 2024

Evaluation Results

Date      Reliability Score   (unlabeled)   Method   Links
2024.02   16.9                -
2024.02   13                  -
2024.02   11.8                -
2024.02   10.8                -
2024.02   10.3                -
2024.02   10.3                -
2024.02   10                  -
2024.02   9.1                 -
2024.02   7.6                 -
2024.02   7.1                 -
2024.02   6.6                 -
2024.02   6.1                 -
2024.02   5.4                 -
2024.02   5.1                 -
2024.02   4.9                 -
2024.02   4.2                 -
2024.02   3.9                 -
2024.02   3.2                 -
2024.02   2.7                 -
2024.02   2.7                 -
2024.02   2                   -
2024.02   1.7                 -
2024.02   1.5                 -
2024.02   1.2                 -
2024.02   1                   -
2024.02   0.7                 -
2024.02   0.7                 -
2024.02   0.5                 -
2024.02   0.5                 -
2024.02   0                   -
2024.02   -0.2                -
2024.02   -0.5                -
2024.02   -1.2                -
2024.02   -                   31.6
2024.02   -                   42.7
2024.02   -                   27.2
2024.02   -                   53
2024.02   -                   26.3
2024.02   -                   51.7
2024.02   -                   41.2
2024.02   -                   52
2024.07   -                   44
2024.07   -                   36
2024.07   -                   41
2024.07   -                   42
2024.07   -                   45