| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| ZSRE | AlphaEdit | Generality 97.36 | 110 | 4d ago | |
| Counterfact | EAMET | Efficacy 9,387 | 91 | 4d ago | |
| MMEdit E-VQA | In-Context Editing | Reliability 100 | 61 | 4d ago | |
| VLKEB | LiveEdit | Reliability 98.77 | 45 | 4d ago | |
| ZsRE 10,000 facts | GRACE | Reliability 100 | 27 | 4d ago | |
| Counterfact 10,000 facts | GRACE | Relational Score 10,000 | 27 | 4d ago | |
| MMEdit E-IC 1.0 (test) | | Reliability 100 | 24 | 4d ago | |
| E-VQA MMEdit 1.0 (test) | | Reliability 100 | 24 | 4d ago | |
| MzsRE Edit: EN, Test: EN | IKE | Reliability 10,000 | 23 | 4d ago | |
| EVK-Bench | | Embedding Stability (ES) 100 | 21 | 4d ago | |
| Counterfact Full (test) | MEMIT | Rel. Accuracy 99 | 21 | 4d ago | |
| ZsRE (evaluation) | ONCEEDIT | Reliability 99 | 21 | 4d ago | |
| ZsRE (test) | GRACE | Normalized Editing Time 0.65 | 18 | 4d ago | |
| UnKEBench Paraphrased questions (Para.) | AnyEdit | BERTScore 96.6 | 18 | 4d ago | |
| UnKEBench Original questions | AnyEdit* | BERTScore 99.86 | 18 | 4d ago | |
| UnKEBench | COIN | Precision 52.17 | 16 | 4d ago | |
| CHED | EMMET | S 93.5 | 16 | 4d ago | |
| MMEdit E-IC | SERAC | Reliability 99.7 | 16 | 4d ago | |
| Multilingual Knowledge Editing EN LLaMA backbone (test) | ReMaKE-few-bi | Reliability 90.17 | 16 | 4d ago | |
| RIPPLEEDITS single-instance | FT | Reliability 100 | 16 | 4d ago | |
| filtered dataset original (test) | ChainEdit + FT | Reliability 1 | 16 | 4d ago | |
| in-prompt | FT | Reliability 1 | 16 | 4d ago | |
| replaced | FT | Reliability 100 | 16 | 4d ago | |
| MQuAKE-Story 1.0 (test) | Qwen3 (SFT) | Fact Accuracy (Easy) 100 | 14 | 4d ago | |
| MQuAKE Story | | Fact Accuracy (Easy) 100 | 14 | 4d ago | |