Factuality Error Localization on FRANK
[Chart: Accuracy (OutE) by date. Highlighted point: Bart-Large, 56.7, Jul 1, 2024. Selectable metrics: Accuracy (OutE), Accuracy (EntE), Accuracy (PredE), Accuracy (CirE), Accuracy (GramE), Accuracy (LinkE), Accuracy (CorefE), Mean Accuracy.]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Accuracy (OutE) | Accuracy (EntE) | Accuracy (PredE) | Accuracy (CirE) | Accuracy (GramE) | Accuracy (LinkE) | Accuracy (CorefE) | Mean Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| Bart-Large | Protocol=Fine-tuned on... | 2024.07 | 56.7 | 36.9 | 14.8 | 34 | 21.4 | 0 | 40 | 29.1 |
| FineSurE | Backbone=GPT-4 | 2024.07 | 50.2 | 63.7 | 41.9 | 38.1 | 44.6 | 19.4 | 37.8 | 42.2 |
| Random Guessing | Description=Randomly s... | 2024.07 | 14.3 | 14.3 | 14.3 | 14.3 | 14.3 | 14.3 | 14.3 | 14.3 |
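The page does not say how Mean Accuracy is computed. The reported values are consistent with an unweighted mean of the seven per-category accuracies, and the 14.3 baseline is consistent with uniform guessing over seven categories (100 / 7). The following is a minimal Python sketch that checks this, assuming unweighted averaging; the variable and function names are illustrative and not part of the benchmark.

```python
from statistics import mean

# Per-category accuracies copied from the results above, in the order
# OutE, EntE, PredE, CirE, GramE, LinkE, CorefE.
per_category = {
    "Bart-Large": [56.7, 36.9, 14.8, 34.0, 21.4, 0.0, 40.0],
    "FineSurE": [50.2, 63.7, 41.9, 38.1, 44.6, 19.4, 37.8],
    "Random Guessing": [14.3] * 7,
}

# Mean Accuracy values as reported on the page.
reported_mean = {
    "Bart-Large": 29.1,
    "FineSurE": 42.2,
    "Random Guessing": 14.3,
}


def mean_accuracy(scores):
    """Unweighted mean over categories, rounded to one decimal (an assumption)."""
    return round(mean(scores), 1)


computed = {name: mean_accuracy(scores) for name, scores in per_category.items()}

# Expected accuracy of uniform guessing over seven categories, in percent.
random_baseline = round(100 / 7, 1)

for name, value in computed.items():
    print(f"{name}: computed {value}, reported {reported_mean[name]}")
print(f"Uniform guess over 7 categories: {random_baseline}")
```

If the benchmark weighted categories by their frequency in FRANK instead, the computed and reported means would generally differ, so agreement here supports (but does not prove) the unweighted reading.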