
Erasing CLIP Memories: Non-Destructive, Data-Free Zero-Shot Class Unlearning in CLIP Models

About

We introduce a novel, closed-form approach for selective unlearning in multimodal models, specifically targeting pretrained models such as CLIP. Our method leverages nullspace projection to erase the target class information embedded in the final projection layer, without requiring any retraining or any images from the forget set. By computing an orthonormal basis for the subspace spanned by the target text embeddings and projecting out these directions, we dramatically reduce the alignment between image features and the undesired classes. Unlike traditional unlearning techniques that rely on iterative fine-tuning and extensive data curation, our approach is both computationally efficient and surgically precise. This leads to a pronounced drop in zero-shot performance on the target classes while preserving the model's overall multimodal knowledge. Our experiments demonstrate that even a partial projection can strike a balance between complete unlearning and retention of useful information, addressing key challenges in model decontamination and privacy preservation.
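The core operation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the dimensions, weight matrices, and the blending scalar `alpha` are hypothetical stand-ins for the pretrained CLIP projection weights and the paper's partial-projection mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d-dimensional joint embedding space, k classes to forget.
d, k = 512, 5

# Stand-ins for the model's final projection weights and the text embeddings
# of the target (forget) classes; in practice both come from pretrained CLIP.
W = rng.standard_normal((d, d))   # final projection layer weight
T = rng.standard_normal((d, k))   # target-class text embeddings, one per column

# Orthonormal basis for the subspace spanned by the target text embeddings.
Q, _ = np.linalg.qr(T)            # Q has shape (d, k), orthonormal columns

# Nullspace projector: removes components along the target directions.
# alpha = 1.0 erases them fully; alpha < 1 gives a partial projection that
# trades off unlearning strength against retained knowledge.
alpha = 1.0
P = np.eye(d) - alpha * (Q @ Q.T)

# One closed-form edit of the layer: no retraining, no forget-set images.
W_unlearned = P @ W

# An image feature passed through the edited layer is now (near-)orthogonal
# to the forgotten classes' text embeddings, so zero-shot scores collapse.
x = rng.standard_normal(d)                 # dummy image feature
scores = T.T @ (W_unlearned @ x)           # alignment with forgotten classes
print(np.max(np.abs(scores)))              # ~0 when alpha = 1.0
```

Because the target text embeddings lie entirely inside the span of `Q`, their alignment with any projected output is zero in exact arithmetic when `alpha = 1.0`; smaller `alpha` values shrink it proportionally instead of eliminating it.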

Ashish Mishra, Tarun Kumar, Gyanaranjan Nayak, Arpit Shah, Suparna Bhattacharya, Martin Foltin • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | Oxford Flowers (test) | - | 73 |
| Image Classification | Target Classes Forget Set (test) | BF 95.8 | 24 |
| Image Classification | StanfordDogs (test) | BF Score 61.4 | 18 |
| Image Classification | StanfordCars standard (test) | BF Accuracy 68.4 | 18 |
| Image Classification | Caltech101 standard (test) | BF Score 94.5 | 18 |
| Machine Unlearning | Tiny-ImageNet | - | 16 |
| Zero-shot Class Unlearning | StanfordCars (forget set) | Bias Factor 82.23 | 12 |
| Zero-shot Class Unlearning | StanfordCars (retain set) | BF 81.45 | 12 |
| Zero-shot Class Unlearning | StanfordDogs (forget set) | BF 80.38 | 12 |
| Zero-shot Class Unlearning | StanfordDogs (retain set) | BF 60.18 | 12 |

Showing 10 of 17 rows
