Knowledge Editing on MQuAKE-3K
[Chart: Efficacy. Best result: ACE, 99.8 (Oct 9, 2025). Selectable metrics: Efficacy, Paraphrase, Specificity. Updated 1mo ago.]
Evaluation Results
Method  Base Model  Date     Efficacy  Paraphrase  Specificity
ACE     GPT-J       2025.10  99.8      91.2        79.2
ACE     Qwen3-8B    2025.10  99.4      94.2        81.8
FT      GPT-J       2025.10  98.4      74.5        83.8
FT      Qwen3-8B    2025.10  97.1      73.2        79.6
PMET    GPT-J       2025.10  81.6      65.8        74.6
PMET    Qwen3-8B    2025.10  75.6      68.9        64.4
ROME    GPT-J       2025.10  64.2      61.6        66.8
MEMIT   GPT-J       2025.10  62.8      66.2        70.0
MEMIT   Qwen3-8B    2025.10  53.6      61.8        64.7
ROME    Qwen3-8B    2025.10  51.8      49.3        57.2