Retain-Neutral Surrogates for Min-Max Unlearning

About

Machine unlearning seeks to remove the influence of designated training data while preserving performance on the remaining data. Approximate unlearning can be viewed as a local editing problem; in min-max unlearning, the key local object is the surrogate point at which the retain objective is evaluated. When forget and retain gradients are strongly aligned, an unconstrained forget-maximizing perturbation can move to a surrogate point that increases retain loss. We propose Retain-Orthogonal Surrogate Unlearning (ROSU), which constrains the inner surrogate construction by maximizing first-order forget gain subject to zero first-order retain change under a fixed perturbation budget. This yields a closed-form retain-orthogonal perturbation, a lightweight transported outer update, and amplification along the retain-neutral direction. Our analysis establishes (i) a curvature-controlled second-order bound on retain damage, (ii) a positive-alignment regime in which ROSU strictly reduces surrogate retain loss relative to standard min-max perturbations, and (iii) near-equivalence when the two gradients are nearly orthogonal. Across vision and language benchmarks (CIFAR-10/100, Tiny-ImageNet, TOFU, WMDP), the empirical pattern follows this geometry: ROSU gives its clearest gains in high-coupling regimes while remaining competitive elsewhere.
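The constrained inner step described above can be illustrated with a small NumPy sketch. It assumes the perturbation budget is an L2 ball of radius `rho` (the abstract does not state the norm), and the function names are hypothetical rather than taken from the authors' code. Under that assumption, maximizing the first-order forget gain subject to zero first-order retain change amounts to projecting the forget gradient onto the orthogonal complement of the retain gradient and rescaling it to the budget. The transported outer update and the amplification step are not covered here.

```python
import numpy as np

def retain_orthogonal_perturbation(g_forget, g_retain, rho, eps=1e-12):
    """Sketch: maximize g_forget . delta subject to g_retain . delta = 0
    and ||delta||_2 <= rho (the L2 budget is an assumption)."""
    g_f = np.asarray(g_forget, dtype=float)
    g_r = np.asarray(g_retain, dtype=float)
    r_sq = g_r @ g_r
    if r_sq > eps:
        # Remove the component of the forget gradient along the retain gradient.
        g_perp = g_f - (g_f @ g_r) / r_sq * g_r
    else:
        # No retain gradient: the constraint is vacuous.
        g_perp = g_f.copy()
    norm = np.linalg.norm(g_perp)
    if norm <= eps:
        # Gradients are parallel: no retain-neutral direction gains forget loss.
        return np.zeros_like(g_f)
    return rho * g_perp / norm

def standard_perturbation(g_forget, rho, eps=1e-12):
    """Unconstrained forget-maximizing step, shown for comparison."""
    g_f = np.asarray(g_forget, dtype=float)
    return rho * g_f / max(np.linalg.norm(g_f), eps)

# Toy example with strongly aligned gradients (illustrative values only).
g_f = np.array([1.0, 1.0])
g_r = np.array([1.0, 0.0])
delta_orth = retain_orthogonal_perturbation(g_f, g_r, rho=0.1)
delta_std = standard_perturbation(g_f, rho=0.1)
print("first-order retain change, orthogonal:", g_r @ delta_orth)
print("first-order retain change, standard:  ", g_r @ delta_std)
```

In this toy case the orthogonal step leaves the first-order retain change at zero while the unconstrained step increases it, which is the high-coupling situation the abstract highlights. When the two gradients are nearly orthogonal, the projection changes little and the two perturbations nearly coincide.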

Junhao Cai, Dohun Kim, Dowon Kim, Sung Il Choi, Chengjun Jin, Juhyun Park, Changhee Joo • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| LLM Unlearning | WMDP | Delta Score | 5.4 | 14 |
| LLM Unlearning | TOFU | Aggregated Score | 58 | 13 |
| Machine Unlearning | CIFAR-10 (class-wise forgetting) | Retention Accuracy (RA) | 99.96 | 11 |
| Machine Unlearning | CIFAR-10 (random forgetting) | Retention Accuracy (RA) | 99.84 | 11 |
| Class-wise Forgetting Unlearning | CIFAR-100 | Retention Accuracy (RA) | 99.98 | 11 |
| Random Forgetting Unlearning | CIFAR-100 | Retention Accuracy (RA) | 99.78 | 11 |
| Machine Unlearning | Tiny-ImageNet (class-wise forgetting) | Retention Accuracy (RA) | 99.97 | 10 |
| Machine Unlearning | Tiny-ImageNet (random forgetting) | Retention Accuracy (RA) | 98.7 | 10 |
