
Erasing CLIP Memories: Non-Destructive, Data-Free Zero-Shot Class Unlearning in CLIP Models

About

We introduce a novel, closed-form approach for selective unlearning in multimodal models, specifically targeting pretrained models such as CLIP. Our method leverages nullspace projection to erase the target class information embedded in the final projection layer, without requiring any retraining or any images from the forget set. By computing an orthonormal basis for the subspace spanned by the target text embeddings and projecting out these directions, we dramatically reduce the alignment between image features and the undesired classes. Unlike traditional unlearning techniques that rely on iterative fine-tuning and extensive data curation, our approach is both computationally efficient and surgically precise. This leads to a pronounced drop in zero-shot performance on the target classes while preserving the model's overall multimodal knowledge. Our experiments demonstrate that even a partial projection can strike a balance between complete unlearning and retention of useful information, addressing key challenges in model decontamination and privacy preservation.
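The core operation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the dimensions, weight matrices, and the blending scalar `alpha` are hypothetical stand-ins for the pretrained CLIP projection weights and the paper's partial-projection mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d-dimensional joint embedding space, k classes to forget.
d, k = 512, 5

# Stand-ins for the model's final projection weights and the text embeddings
# of the target (forget) classes; in practice both come from pretrained CLIP.
W = rng.standard_normal((d, d))   # final projection layer weight
T = rng.standard_normal((d, k))   # target-class text embeddings, one per column

# Orthonormal basis for the subspace spanned by the target text embeddings.
Q, _ = np.linalg.qr(T)            # Q has shape (d, k), orthonormal columns

# Nullspace projector: removes components along the target directions.
# alpha = 1.0 erases them fully; alpha < 1 gives a partial projection that
# trades off unlearning strength against retained knowledge.
alpha = 1.0
P = np.eye(d) - alpha * (Q @ Q.T)

# One closed-form edit of the layer: no retraining, no forget-set images.
W_unlearned = P @ W

# An image feature passed through the edited layer is now (near-)orthogonal
# to the forgotten classes' text embeddings, so zero-shot scores collapse.
x = rng.standard_normal(d)                 # dummy image feature
scores = T.T @ (W_unlearned @ x)           # alignment with forgotten classes
print(np.max(np.abs(scores)))              # ~0 when alpha = 1.0
```

Because the target text embeddings lie entirely inside the span of `Q`, their alignment with any projected output is zero in exact arithmetic when `alpha = 1.0`; smaller `alpha` values shrink it proportionally instead of eliminating it.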

Ashish Mishra, Tarun Kumar, Gyanaranjan Nayak, Arpit Shah, Suparna Bhattacharya, Martin Foltin • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | Oxford Flowers (test) | - | 73 |
| Image Classification | Target Classes Forget Set (test) | BF 95.8 | 24 |
| Image Classification | StanfordDogs (test) | BF Score 61.4 | 18 |
| Image Classification | StanfordCars standard (test) | BF Accuracy 68.4 | 18 |
| Image Classification | Caltech101 standard (test) | BF Score 94.5 | 18 |
| Machine Unlearning | Tiny-ImageNet | - | 16 |
| Zero-shot Class Unlearning | StanfordCars (forget set) | Bias Factor 82.23 | 12 |
| Zero-shot Class Unlearning | StanfordCars (retain set) | BF 81.45 | 12 |
| Zero-shot Class Unlearning | StanfordDogs (forget set) | BF 80.38 | 12 |
| Zero-shot Class Unlearning | StanfordDogs (retain set) | BF 60.18 | 12 |

Showing 10 of 17 rows
