Factuality Error Localization on FRANK
[Chart: Accuracy (OutE) over time. Highlighted result: Bart-Large, 56.7, Jul 1, 2024]
Updated 4d ago
Evaluation Results
All values are accuracy (%) per FRANK error category.

| Method | Notes | Date | OutE | EntE | PredE | CirE | GramE | LinkE | CorefE | Mean Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| Bart-Large | Protocol=Fine-tuned on... | 2024.07 | 56.7 | 36.9 | 14.8 | 34 | 21.4 | 0 | 40 | 29.1 |
| FineSurE | Backbone=GPT-4 | 2024.07 | 50.2 | 63.7 | 41.9 | 38.1 | 44.6 | 19.4 | 37.8 | 42.2 |
| Random Guessing | Description=Randomly s... | 2024.07 | 14.3 | 14.3 | 14.3 | 14.3 | 14.3 | 14.3 | 14.3 | 14.3 |
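The Mean Accuracy column appears to be the unweighted average of the seven per-category accuracies; the page does not state this, but the reported numbers are consistent with it. The sketch below recomputes the column from the values in the results table. The names `results`, `mean_accuracy`, and `random_baseline` are hypothetical helpers introduced for illustration, and the 1-in-7 chance level for Random Guessing is an assumption that happens to match the reported 14.3.

```python
# Per-category accuracies (%) copied from the results table above,
# in the order OutE, EntE, PredE, CirE, GramE, LinkE, CorefE.
CATEGORIES = ["OutE", "EntE", "PredE", "CirE", "GramE", "LinkE", "CorefE"]

results = {
    "Bart-Large": [56.7, 36.9, 14.8, 34.0, 21.4, 0.0, 40.0],
    "FineSurE": [50.2, 63.7, 41.9, 38.1, 44.6, 19.4, 37.8],
}


def mean_accuracy(scores):
    """Unweighted (macro) average over categories, rounded to one decimal.

    Assumption: every category counts equally, regardless of how many
    examples it contains.
    """
    return round(sum(scores) / len(scores), 1)


def random_baseline(num_categories):
    """Chance-level accuracy (%) if one of the categories is picked uniformly.

    Assumption: this is how the Random Guessing row was produced.
    """
    return round(100.0 / num_categories, 1)


if __name__ == "__main__":
    for method, scores in results.items():
        print(f"{method}: {mean_accuracy(scores)}")
    print(f"Random Guessing: {random_baseline(len(CATEGORIES))}")
```

Because the average is unweighted, a single category with zero accuracy (LinkE for Bart-Large) pulls the mean down as much as any other category would, which is why Bart-Large leads on OutE yet trails FineSurE overall.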