Knowledge Editing on E-VQA 1,000 sequential edits
[Chart: Reliability scores. Highlighted entry: DSCA, 96.85 (Apr 9, 2026). Selectable metrics: Reliability, Textual Generalization, Multimodal Generalization, Textual Locality, Multimodal Locality, Average Score.]
Updated 8d ago
Evaluation Results
Method    Backbone      Date     Reliability  Textual Generalization  Multimodal Generalization  Textual Locality  Multimodal Locality  Average Score
DSCA      LLaVA-1.5-7B  2026.04  96.85        93.1                    88                         100               98.2                 95.23
LiveEdit  LLaVA-1.5-7B  2026.04  92.93        90.16                   84.3                       100               96.43                92.76
SERAC     LLaVA-1.5-7B  2026.04  85.57        75.58                   82.01                      62.46             15.69                64.26
LTE       LLaVA-1.5-7B  2026.04  83.93        82.55                   81.34                      83.97             73.09                80.98
MEND      LLaVA-1.5-7B  2026.04  0.04         0.05                    0.05                       0.08              0.09                 0.06
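
The page does not say how Average Score is computed. The reported values are consistent with an unweighted mean of the five metric columns, rounded to two decimals. The sketch below checks that assumption against the rows above. The names `ROWS`, `mean_score`, and `check_rows` are illustrative and not part of the benchmark.

```python
# Scores copied from the results table, in column order:
# Reliability, Textual Generalization, Multimodal Generalization,
# Textual Locality, Multimodal Locality; then the reported Average Score.
ROWS = {
    "DSCA":     ([96.85, 93.1, 88, 100, 98.2], 95.23),
    "LiveEdit": ([92.93, 90.16, 84.3, 100, 96.43], 92.76),
    "SERAC":    ([85.57, 75.58, 82.01, 62.46, 15.69], 64.26),
    "LTE":      ([83.93, 82.55, 81.34, 83.97, 73.09], 80.98),
    "MEND":     ([0.04, 0.05, 0.05, 0.08, 0.09], 0.06),
}


def mean_score(metrics):
    """Unweighted arithmetic mean of the metric values."""
    return sum(metrics) / len(metrics)


def check_rows(rows, tol=0.005):
    """Compare the computed mean with the reported average for each method.

    Assumption: the reported value is the mean rounded to two decimals,
    so a difference of at most `tol` counts as a match.
    Returns {method: (computed_mean, reported_average, matches)}.
    """
    results = {}
    for method, (metrics, reported) in rows.items():
        computed = mean_score(metrics)
        results[method] = (computed, reported, abs(computed - reported) <= tol)
    return results


if __name__ == "__main__":
    for method, (computed, reported, ok) in check_rows(ROWS).items():
        status = "matches" if ok else "differs"
        print(f"{method}: computed {computed:.3f}, reported {reported} ({status})")
```

Because the mean is unweighted, one weak column lowers the average sharply: SERAC's Multimodal Locality of 15.69 pulls its Average Score to 64.26 even though its other four metrics are all above 62.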