Lifelong Knowledge Editing on GPT-2 XL (1024 Edits, test)
[Chart: Score (S) by method over time. Top entry: WilKE, 19.3 (Feb 16, 2024).]
Evaluation Results
| Method | Backbone | Date | Score (S) | Effectiveness (ES) | Generality (PS) | Locality (NS) | Retention (ERS) | Retention (ORS) |
|---|---|---|---|---|---|---|---|---|
| WilKE | GPT-2 XL (1.5B) | 2024.02 | 19.3 | 7,070 | 5,100 | 1,270 | 1,610 | 1,310 |
| MEMIT | GPT-2 XL (1.5B) | 2024.02 | 13.2 | 9,250 | 5,510 | 3,500 | 660 | 660 |
| ROME | GPT-2 XL (1.5B) | 2024.02 | 9.3 | 1,580 | 880 | 680 | 1,220 | 790 |
| KE | GPT-2 XL (1.5B) | 2024.02 | 0 | 10 | 10 | 0 | 0 | 0 |
| KN | GPT-2 XL (1.5B) | 2024.02 | 0 | 10 | 100 | 0 | 0 | 0 |
| MEND | GPT-2 XL (1.5B) | 2024.02 | 0 | 50 | 10 | 40 | 0 | 0 |