Factual Precision Evaluation on Bios
[Chart: FACTSCORE; top entry MistralINST, 83 (Jul 4, 2024). Metrics: FACTSCORE, FACTSCORER-ND, SAFE.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | FACTSCORE | FACTSCORER-ND | SAFE |
|---|---|---|---|---|---|
| MistralINST | Scenario=INFO, CORE=w/o | 2024.07 | 83 | 75.9 | 84.8 |
| GPT-2 | Scenario=INFO, CORE=w/o | 2024.07 | 82.2 | 78.1 | 70.3 |
| MistralINST | Scenario=REP, CORE=w/o | 2024.07 | 78 | 78.1 | 80.6 |
| MistralINST | Scenario=NORMAL, CORE=w/o | 2024.07 | 54 | 53.9 | 61.7 |
| MistralINST | Scenario=NORMAL, CORE=w/ | 2024.07 | 49.6 | 48.3 | 61.3 |
| MistralINST | Scenario=INFO, CORE=w/ | 2024.07 | 36.2 | 43.6 | 29.6 |
| GPT-2 | Scenario=REP, CORE=w/o | 2024.07 | 35.4 | 40.5 | 36 |
| MistralINST | Scenario=REP, CORE=w/ | 2024.07 | 21.9 | 26 | 14.5 |
| GPT-2 | Scenario=REP, CORE=w/ | 2024.07 | 5.35 | 7.32 | 4.37 |
| GPT-2 | Scenario=INFO, CORE=w/ | 2024.07 | 0.68 | 2.16 | 0.35 |