Sequential Knowledge Editing on CounterFact
[Chart: Efficacy over time. Labeled point: GRACE, Oct 31, 2024. Selectable metrics: Efficacy, Paraphrase, Specificity, Average Score.]
Updated 4d ago
Evaluation Results
| Method | Model | Date | Efficacy | Paraphrase | Specificity | Average Score |
|--------|-------|---------|----------|------------|-------------|---------------|
| GRACE | GPT | 2024.10 | 100 | 0.5 | 100 | 66.83 |
| GRACE | Llama | 2024.10 | 99.9 | 0.25 | 99.97 | 66.71 |
| D4S | GPT | 2024.10 | 99.1 | 47 | 63.7 | 69.93 |
| D4S | Llama | 2024.10 | 96.68 | 46.66 | 72.45 | 71.93 |
| MEMIT | GPT | 2024.10 | 86.2 | 59.5 | 31.3 | 59 |
| FT | GPT | 2024.10 | 32.8 | 9 | 1 | 14.27 |
| ROME | Llama | 2024.10 | 27.83 | 16.03 | 5.66 | 16.5 |
| FT | Llama | 2024.10 | 8.46 | 4.07 | 2.03 | 4.85 |
| ROME | GPT | 2024.10 | 0.6 | 0.7 | 0.6 | 0.63 |
| MEMIT | Llama | 2024.10 | 0 | 0 | 6.72 | 2.24 |
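The page does not say how Average Score is computed. The listed values appear consistent with an unweighted mean of Efficacy, Paraphrase, and Specificity. The sketch below checks that assumption against the results listed above; the names `mean_score` and `check_rows` and the 0.01 tolerance are illustrative choices, not part of the benchmark.

```python
# Rows copied from the results above:
# (method, model, efficacy, paraphrase, specificity, listed_average)
ROWS = [
    ("GRACE", "GPT",   100.0,  0.5,   100.0,  66.83),
    ("GRACE", "Llama", 99.9,   0.25,  99.97,  66.71),
    ("D4S",   "GPT",   99.1,   47.0,  63.7,   69.93),
    ("D4S",   "Llama", 96.68,  46.66, 72.45,  71.93),
    ("MEMIT", "GPT",   86.2,   59.5,  31.3,   59.0),
    ("FT",    "GPT",   32.8,   9.0,   1.0,    14.27),
    ("ROME",  "Llama", 27.83,  16.03, 5.66,   16.5),
    ("FT",    "Llama", 8.46,   4.07,  2.03,   4.85),
    ("ROME",  "GPT",   0.6,    0.7,   0.6,    0.63),
    ("MEMIT", "Llama", 0.0,    0.0,   6.72,   2.24),
]


def mean_score(efficacy, paraphrase, specificity):
    """Assumed aggregation: unweighted mean of the three metrics."""
    return (efficacy + paraphrase + specificity) / 3


def check_rows(rows, tol=0.01):
    """Return (method, model) pairs whose listed average differs from
    the recomputed mean by more than tol (allows for display rounding)."""
    mismatches = []
    for method, model, eff, par, spec, listed in rows:
        if abs(mean_score(eff, par, spec) - listed) > tol:
            mismatches.append((method, model))
    return mismatches


if __name__ == "__main__":
    for method, model, eff, par, spec, listed in ROWS:
        recomputed = mean_score(eff, par, spec)
        print(f"{method:6s} {model:6s} listed={listed:6.2f} recomputed={recomputed:6.2f}")
    print("mismatches:", check_rows(ROWS))
```

Under this reading, a method can reach a mid-range Average Score while failing one metric almost entirely: GRACE scores near 100 on Efficacy and Specificity but below 1 on Paraphrase, so the per-metric columns are worth reading alongside the average.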