TRAK: Attributing Model Behavior at Scale

About

The goal of data attribution is to trace model predictions back to training data. Despite a long line of work towards this goal, existing approaches to data attribution tend to force users to choose between computational tractability and efficacy. That is, computationally tractable methods can struggle with accurately attributing model predictions in non-convex settings (e.g., in the context of deep neural networks), while methods that are effective in such regimes require training thousands of models, which makes them impractical for large models or datasets. In this work, we introduce TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models. In particular, by leveraging only a handful of trained models, TRAK can match the performance of attribution methods that require training thousands of models. We demonstrate the utility of TRAK across various modalities and scales: image classifiers trained on ImageNet, vision-language models (CLIP), and language models (BERT and mT5). We provide code for using TRAK (and reproducing our work) at https://github.com/MadryLab/trak.

Sung Min Park, Kristian Georgiev, Andrew Ilyas, Guillaume Leclerc, Aleksander Madry • 2023
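
As a loose illustration of the "randomly-projected" part of the name only, the sketch below shows that a Gaussian random projection approximately preserves inner products between high-dimensional vectors, which is what makes working in a much smaller space feasible. It is not the TRAK algorithm, whose details are not given in the abstract above. The vectors are synthetic stand-ins for per-example gradients, and all names (`random_projection`, `project`, `grad_train`, `grad_test`) and dimensions are hypothetical.

```python
import math
import random


def random_projection(dim_in, dim_out, seed=0):
    """Return a dim_out x dim_in Gaussian matrix scaled by 1/sqrt(dim_out)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]


def project(matrix, vec):
    """Multiply the projection matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Synthetic stand-ins for per-example gradient vectors (assumption: not real data).
rng = random.Random(1)
dim_in, dim_out = 1000, 256
grad_train = [rng.gauss(0.0, 1.0) for _ in range(dim_in)]
# A "test gradient" correlated with the training one, so the inner product is large.
grad_test = [g + 0.5 * rng.gauss(0.0, 1.0) for g in grad_train]

P = random_projection(dim_in, dim_out)
exact = dot(grad_train, grad_test)
approx = dot(project(P, grad_train), project(P, grad_test))
rel_error = abs(approx - exact) / abs(exact)

print(f"exact={exact:.1f} approx={approx:.1f} relative error={rel_error:.3f}")
```

The projected vectors have 256 entries instead of 1000, yet their inner product stays close to the exact one. Increasing `dim_out` tightens the approximation at the cost of more storage and compute.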

Related benchmarks

Task	Dataset	Result
Contributor Attribution	Fashion Product	Diversity 8.31	48
Contributor Attribution	ArtBench Post-Impressionism	Aesthetic Score 8.31	36
Contributor Attribution	CIFAR-20	Inception Score 24.08	32
Medical Image-Text Classification	Medical Specialties	Ophthalmology Performance 43.25	30
Image-Text Retrieval	General Domain	Retrieval Score 27.17	30
Image Classification	General Domain 31 tasks	CLS Score 48.24	30
Contributor Attribution	ArtBench Post-Impressionism (test)	Aesthetic Score 3.16	18
Contributor Attribution	CIFAR-20 (test)	Inception Score 1.68	16
Audio Attribution	FMA Large (test)	R@1 99.3	15
Mislabeled Data Detection	GLUE	MRPC Score 84.4	13

Showing 10 of 18 rows
