
Counterfact

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Knowledge Editing | Counterfact | Efficacy 9,387 | 362 |
| Sequential model editing | Counterfact | Efficacy 99.88 | 81 |
| Sequential Model Editing | CounterFact T = 300 | Efficacy 99.3 | 36 |
| Subject inference attack | CounterFact batch-edit tasks | Recall 100 | 36 |
| Lifelong Model Editing | CounterFact | Efficacy 73.06 | 33 |
| Sequential Knowledge Editing | CounterFact sequential editing 10,000 Samples | Efficacy Success 99.5 | 33 |
| Knowledge Editing | Counterfact uns | Edit Success Rate 94.56 | 30 |
| Model Editing | CounterFact | Reliability 92 | 30 |
| Knowledge Editing | Counterfact 10,000 facts | Relational Score 10,000 | 27 |
| Knowledge Model Editing | CounterFact | Efficacy 64.85 | 26 |
| Model Editing | CounterFact | Reliability 79.4 | 26 |
| Model Editing | CounterFact | Efficacy 96.12 | 24 |
| Knowledge Editing | COUNTERFACT RS | Efficacy 100 | 23 |
| Classification Probing | Counterfact (test) | Probe Acc (Best Layer) 89.6 | 21 |
| Knowledge Editing | Counterfact Full (test) | Rel. Accuracy 99 | 21 |
| One-Time Edit | CounterFact | Efficacy 99.88 | 20 |
| Knowledge Editing | Counterfact | AVG Score 92.96 | 20 |
| Knowledge Editing | Counterfact (first 2000 edits) | Accuracy 99.95 | 17 |
| Sequential Knowledge Editing | CounterFact larger | Efficacy 98.97 | 14 |
| Sequential Knowledge Editing | CounterFact top | Efficacy 93.87 | 14 |
| Lifelong Knowledge Editing | COUNTERFACT | Reliability 67.1 | 14 |
| Sequential Model Editing | CounterFact T = 5000 | Efficacy 96.6 | 13 |
| Model Editing | CounterFact 3,000 samples (test) | Reliability 9,980 | 13 |
| Fact | CounterFact | Efficacy Score (%) 76.54 | 12 |
| Knowledge Editing | Counterfact (test) | RwA 99.86 | 12 |
Showing 25 of 54 rows