Model Editing on ZsRE (Reliability, Generalization, MMLU, IFEval, TruthfulQA, ARC-C, GSM8K)

80.5 Reliability

CrispEdit

Feb 17, 2026
Updated 4d ago

Evaluation Results

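| Date | Reliability | Generalization | MMLU | IFEval | TruthfulQA | ARC-C | GSM8K |
|---|---|---|---|---|---|---|---|
| 2026.02 | 80.5 | 69 | 69.5 | 67.9 | 50.5 | 55 | 76 |
| 2026.02 | 72.8 | 60.6 | 67.8 | 70.2 | 53.6 | 52 | 74 |
| 2026.02 | 71.1 | 62.9 | 67.8 | 70.2 | 53.6 | 52 | 74 |
| 2026.02 | 70.1 | 60.6 | 52.7 | 47.7 | 46.3 | 40.5 | 45.5 |
| 2026.02 | 69.5 | 59.7 | 69.5 | 70.1 | 51.6 | 54 | 75.5 |
| 2026.02 | 57.4 | 50.9 | 69.5 | 67.9 | 50.5 | 55 | 76 |
| 2026.02 | 48.1 | 39.4 | 52.7 | 47.7 | 46.3 | 40.5 | 45.5 |
| 2026.02 | 46.8 | 43.1 | 69.3 | 45 | 48.7 | 43 | 50 |
| 2026.02 | 25.2 | 22.1 | 69.5 | 70.1 | 51.6 | 54 | 75.5 |
| 2026.02 | 22.7 | 17.4 | 69.3 | 72.5 | 51.8 | 54.5 | 73 |
| 2026.02 | 20 | 16.3 | 69.3 | 72.5 | 51.8 | 54.5 | 73 |
| 2026.02 | 18.7 | 7.2 | 67.8 | 70.8 | 52 | 56 | 71 |
| 2026.02 | 16.6 | 15.5 | 69.2 | 29.6 | 50.8 | 42 | 39.5 |
| 2026.02 | 9.9 | 8.3 | 69.3 | 45 | 48.7 | 43 | 50 |
| 2026.02 | 9.1 | 7.4 | 67.8 | 70.8 | 52 | 56 | 71 |
| 2026.02 | 4.4 | 4 | 67.3 | 64.6 | 56 | 47 | 67 |
| 2026.02 | 3.6 | 3.5 | 68.8 | 19.4 | 52.8 | 40.5 | 6.5 |
| 2026.02 | 2.9 | 2.1 | 69.5 | 69.3 | 50.7 | 58 | 73.5 |
| 2026.02 | 2.1 | 1.7 | 69.5 | 69.3 | 50.7 | 58 | 73.5 |
| 2026.02 | 1.9 | 2 | 69.2 | 29.6 | 50.8 | 42 | 39.5 |
| 2026.02 | 1.3 | 0.9 | 67.3 | 64.6 | 56 | 47 | 67 |
| 2026.02 | 0.9 | 1.2 | 68.8 | 19.4 | 52.8 | 40.5 | 6.5 |
| 2026.02 | 0.1 | 0 | 22.9 | 0 | 51.3 | 23.5 | 0 |
| 2026.02 | 0.1 | 0.1 | 22.9 | 0 | 51.3 | 23.5 | 0 |
| 2026.02 | 0 | 0 | 22.9 | 18.2 | 0 | 26 | 0 |
| 2026.02 | 0 | 0 | 22.9 | 18.2 | 0 | 26 | 0 |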