Truthfulness Evaluation on TruthfulQA medical (test)

83.6 Health Score

BioMistral 7B TIES

[Chart: Health Score, Feb 15, 2024]
Updated 1mo ago

Evaluation Results

Method  Links
2024.02   83.6   75     42.1   44.4   61.3
2024.02   80     68.8   42.1   44.4   58.8
2024.02   78.2   75     36.8   55.6   61.4
2024.02   74.5   71.6   60     56.1   65.6
2024.02   72.7   68.8   31.6   33.3   51.6
2024.02   70.9   75     36.8   33.3   54
2024.02   69.1   59.5   52     50.1   57.6
2024.02   69.1   68.8   36.8   33.3   52
2024.02   69.1   81.2   36.8   33.3   55.1
2024.02   67.3   50     36.8   44.4   49.6
2024.02   65.5   62.5   42.1   44.4   53.6
2024.02   63.6   68.8   36.8   44.4   53.4
2024.02   61.8   56.2   31.6   44.4   48.5
2024.02   60     43.8   42.1   44.4   47.5
2024.02   41.8   18.8   26.3   22.2   27.3
2024.02   40     18.8   26.3   44.4   32.37
2024.02   36.4   25     15.8   33.3   27.62
2024.02   34.5   12.5   15.8   33.3   24
2024.02   16.4   18.8   5.3    0      10.1
2024.02   14.5   25     0      0      9.8
2024.02   10.9   25     10.5   0      11.6
2024.02   9.1    25     10.5   0      11.1