
MQUAKE

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Knowledge Editing | MQuAKE | Edit Success Rate | 99.95 | 30 |
| Instruction Following | MQUAKE | Accuracy | 82.5 | 24 |
| Knowledge Editing | MQuAKE-3K (test) | Overall M-Acc. | 50.4168 | 16 |
| Multi-hop Knowledge Editing | MQuAKE-3K | Average Performance | 99.17 | 16 |
| Knowledge Editing | MQuAKE-Story 1.0 (test) | Fact Accuracy (Easy) | 100 | 14 |
| Knowledge Editing | MQuAKE Story | Fact Accuracy (Easy) | 100 | 14 |
| Knowledge Editing | MQuAKE-CF 1.0 (test) | Fact Accuracy (Easy) | 99.9 | 14 |
| Multi-hop Knowledge Editing | MQUAKE-T (All edited) | Accuracy | 78.16 | 12 |
| Multi-hop Knowledge Editing | MQUAKE-T (1 edited) | Accuracy | 97.7 | 12 |
| Multi-hop Knowledge Editing | MQUAKE-CF-3K (100 edited) | Accuracy | 56 | 12 |
| Multi-hop Knowledge Editing | MQUAKE-CF-3K (1 edited) | Accuracy | 67.27 | 12 |
| Knowledge Editing | MQuAKE-3K | Efficacy | 99.8 | 10 |
| Multi-hop Question Answering | MQuAKE | MHQ Accuracy | 31.6 | 10 |
| Multi-hop Knowledge Editing | MQUAKE-CF-3K All edited | Accuracy | 45.87 | 10 |
| Knowledge Editing | MQUAKE | Average Accuracy | 0.7589 | 8 |
| Sequential Knowledge Editing | MQuAKE | Efficacy | 97.4 | 8 |
| Multi-hop Knowledge Editing | MQuAKE CF v2 | 2-hop Score | 69.1 | 6 |
| Multi-hop knowledge editing | MQUAKE | Average Accuracy | 44.39 | 5 |
| Knowledge Editing | MQuAKE-3K | M-hop Success Rate | 59.253 | 4 |