Model Editing on COUNTERFACT 2,000-record GPT-J (test)
[Chart: Score (S) over time. Top result: ROME, 91.5, Feb 10, 2022. Selectable metrics: Score (S), Efficacy Score (ES), Efficacy Magnitude (EM), Paraphrase Score (PS), Paraphrase Magnitude (PM), Neighborhood Score (NS), Neighborhood Magnitude (NM), Generation Entropy (GE), Reference Score (RS).]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Score (S) | Efficacy Score (ES) | Efficacy Magnitude (EM) | Paraphrase Score (PS) | Paraphrase Magnitude (PM) | Neighborhood Score (NS) | Neighborhood Magnitude (NM) | Generation Entropy (GE) | Reference Score (RS) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ROME | Editor=ROME, Base Mode... | 2022.02 | 91.5 | 99.9 | 99.4 | 99.1 | 74.1 | 78.9 | 5.2 | 620.1 | 43 |
| FT+L | Editor=Fine-Tuning wit... | 2022.02 | 68.7 | 99.6 | 95 | 47.9 | 30.4 | 78.6 | 6.8 | 622.8 | 35.5 |
| MEND | Editor=MEND, Base Mode... | 2022.02 | 63.2 | 97.4 | 71.5 | 53.6 | 11 | 53.9 | -6 | 620.5 | 32.6 |
| FT | Editor=Fine-Tuning, Ba... | 2022.02 | 25.5 | 100 | 99.9 | 96.6 | 71 | 10.3 | -50.7 | 387.8 | 24.6 |
| GPT-J | Editor=Pre-edited base... | 2022.02 | 23.6 | 16.3 | -7.2 | 18.6 | -7.4 | 83 | 7.3 | 621.8 | 29.8 |
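The page does not say how Score (S) is derived from the other columns. A minimal sketch follows, assuming S is the harmonic mean of Efficacy Score (ES), Paraphrase Score (PS), and Neighborhood Score (NS); that assumption reproduces every reported S value above to one decimal place. The names `ROWS`, `composite_score`, and `mismatches` are illustrative and not part of the benchmark.

```python
from statistics import harmonic_mean

# (ES, PS, NS) and the reported Score (S) for each method,
# copied from the evaluation results above.
ROWS = {
    "ROME":  ((99.9, 99.1, 78.9), 91.5),
    "FT+L":  ((99.6, 47.9, 78.6), 68.7),
    "MEND":  ((97.4, 53.6, 53.9), 63.2),
    "FT":    ((100.0, 96.6, 10.3), 25.5),
    "GPT-J": ((16.3, 18.6, 83.0), 23.6),
}


def composite_score(es: float, ps: float, ns: float) -> float:
    """Harmonic mean of ES, PS, and NS (the assumed definition of S)."""
    return harmonic_mean([es, ps, ns])


def mismatches(rows=ROWS):
    """Return methods whose recomputed S differs from the reported S
    after rounding to one decimal place."""
    return [
        name
        for name, (metrics, reported) in rows.items()
        if round(composite_score(*metrics), 1) != reported
    ]


if __name__ == "__main__":
    for name, (metrics, reported) in ROWS.items():
        print(f"{name:6s} recomputed={composite_score(*metrics):.2f} reported={reported}")
```

Because a harmonic mean is dominated by its smallest term, a single weak metric pulls S down sharply: unconstrained fine-tuning (FT) has near-perfect ES and PS, yet its NS of 10.3 leaves it with an S of 25.5.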