| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Counterfact | EAMET | Efficacy | 9,387 | 301 | 10d ago |
| ZSRE | MetaKE | Generality | 97.37 | 181 | 10d ago |
| MMEdit E-VQA | In-Context Editing | Reliability | 100 | 61 | 1mo ago |
| VLKEB | LiveEdit | Reliability | 98.77 | 45 | 1mo ago |
| WikiUpdate | CoT2Edit | Edit Success | 85.89 | 30 | 10d ago |
| Counterfact uns | CoT2Edit | Edit Success Rate | 94.56 | 30 | 10d ago |
| MQuAKE | CoT2Edit | Edit Success Rate | 99.95 | 30 | 10d ago |
| ZsRE 10,000 facts | GRACE | Reliability | 100 | 27 | 1mo ago |
| Counterfact 10,000 facts | GRACE | Relational Score | 10,000 | 27 | 1mo ago |
| MMEdit E-IC 1.0 (test) | | Reliability | 100 | 24 | 1mo ago |
| E-VQA MMEdit 1.0 (test) | | Reliability | 100 | 24 | 1mo ago |
| MzsRE Edit: EN, Test: EN | IKE | Reliability | 10,000 | 23 | 1mo ago |
| MMEdit E-IC | SERAC | Reliability | 99.7 | 22 | 8d ago |
| EVK-Bench | | Embedding Stability (ES) | 100 | 21 | 1mo ago |
| Counterfact Full (test) | MEMIT | Rel. Accuracy | 99 | 21 | 1mo ago |
| ZsRE (evaluation) | ONCEEDIT | Reliability | 99 | 21 | 1mo ago |
| CHED | SUIT | S | 95.7 | 19 | 1mo ago |
| ZsRE (test) | GRACE | Normalized Editing Time | 0.65 | 18 | 1mo ago |
| UnKEBench Paraphrased questions (Para.) | AnyEdit | BERTScore | 96.6 | 18 | 1mo ago |
| UnKEBench Original questions | AnyEdit* | BERTScore | 99.86 | 18 | 1mo ago |
| MQuAKE-3K (test) | MCircKE | Overall M-Acc. | 50.4168 | 16 | 10d ago |
| UnKEBench | COIN | Precision | 52.17 | 16 | 1mo ago |
| Multilingual Knowledge Editing EN LLaMA backbone (test) | ReMaKE-few-bi | Reliability | 90.17 | 16 | 1mo ago |
| RIPPLEEDITS single-instance | FT | Reliability | 100 | 16 | 1mo ago |
| filtered dataset original (test) | ChainEdit + FT | Reliability | 1 | 16 | 1mo ago |