Model Editing on COUNTERFACT 2,000-record GPT-J (test)
[Chart: Score (S) over time. Plotted point: ROME, 91.5, Feb 10, 2022.]
Updated 1mo ago
Evaluation Results
| Method | Method details | Links | Score (S) | Efficacy Score (ES) | Efficacy Magnitude (EM) | Paraphrase Score (PS) | Paraphrase Magnitude (PM) | Neighborhood Score (NS) | Neighborhood Magnitude (NM) | Generation Entropy (GE) | Reference Score (RS) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ROME | Editor=ROME, Base Mode... | 2022.02 | 91.5 | 99.9 | 99.4 | 99.1 | 74.1 | 78.9 | 5.2 | 620.1 | 43 |
| FT+L | Editor=Fine-Tuning wit... | 2022.02 | 68.7 | 99.6 | 95 | 47.9 | 30.4 | 78.6 | 6.8 | 622.8 | 35.5 |
| MEND | Editor=MEND, Base Mode... | 2022.02 | 63.2 | 97.4 | 71.5 | 53.6 | 11 | 53.9 | -6 | 620.5 | 32.6 |
| FT | Editor=Fine-Tuning, Ba... | 2022.02 | 25.5 | 100 | 99.9 | 96.6 | 71 | 10.3 | -50.7 | 387.8 | 24.6 |
| GPT-J | Editor=Pre-edited base... | 2022.02 | 23.6 | 16.3 | -7.2 | 18.6 | -7.4 | 83 | 7.3 | 621.8 | 29.8 |
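This page does not say how Score (S) is derived. The sketch below assumes S is the harmonic mean of Efficacy Score (ES), Paraphrase Score (PS), and Neighborhood Score (NS), and checks that assumption against the five rows listed above. The `ROWS` dictionary and the `score` helper are illustrative names, not part of any benchmark tooling.

```python
from statistics import harmonic_mean

# (ES, PS, NS) and the reported Score (S) for each method,
# copied from the evaluation results on this page.
ROWS = {
    "ROME": ((99.9, 99.1, 78.9), 91.5),
    "FT+L": ((99.6, 47.9, 78.6), 68.7),
    "MEND": ((97.4, 53.6, 53.9), 63.2),
    "FT": ((100.0, 96.6, 10.3), 25.5),
    "GPT-J": ((16.3, 18.6, 83.0), 23.6),
}


def score(es, ps, ns):
    """Assumed definition: harmonic mean of ES, PS, and NS."""
    return harmonic_mean([es, ps, ns])


for method, (parts, reported) in ROWS.items():
    computed = score(*parts)
    # Compare the recomputed value with the one reported on the page.
    print(f"{method:6s} computed={computed:6.2f} reported={reported:5.1f}")
```

Because a harmonic mean is pulled toward its smallest input, a method that scores low on any one of the three metrics gets a low S. That would explain why FT, with an ES of 100 but an NS of 10.3, lands near the unedited GPT-J baseline.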