Toward Efficient Influence Function: Dropout as a Compression Tool

About

Assessing the impact of training data on machine learning models is crucial for understanding model behavior, enhancing transparency, and selecting training data. The influence function provides a theoretical framework for quantifying the effect of individual training data points on a model's performance on a specific test data point. However, the computational and memory costs of the influence function present significant challenges, especially for large-scale models, even when approximation methods are used, since the gradients involved in the computation are as large as the model itself. In this work, we introduce a novel approach that leverages dropout as a gradient compression mechanism to compute the influence function more efficiently. Our method significantly reduces computational and memory overhead, not only during the influence function computation but also in the gradient compression process itself. Through theoretical analysis and empirical validation, we demonstrate that our method preserves critical components of the data influence and enables its application to modern large-scale models.
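The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a diagonal Hessian approximation, a single random keep-mask shared by all gradients, and a `dim / keep` rescaling of the compressed estimate; the names `dropout_mask`, `compress`, and `influence` are hypothetical.

```python
import random

def dropout_mask(dim, keep, seed=0):
    """Choose `keep` of `dim` coordinates uniformly at random.
    Assumption: the same mask is applied to every gradient."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(dim), keep))

def compress(vec, mask):
    """Drop every coordinate that is not selected by the mask."""
    return [vec[i] for i in mask]

def influence(g_test, g_train, h_diag, damping=1e-3, scale=1.0):
    """Influence score -g_test^T (H + damping * I)^{-1} g_train.
    Assumption: H is diagonal, so the inverse is elementwise."""
    total = sum(a * b / (h + damping) for a, b, h in zip(g_test, g_train, h_diag))
    return -scale * total

# Toy gradients and Hessian diagonal (synthetic, for illustration only).
dim, keep = 1000, 100
rng = random.Random(1)
g_test = [rng.gauss(0.0, 1.0) for _ in range(dim)]
g_train = [rng.gauss(0.0, 1.0) for _ in range(dim)]
h_diag = [rng.uniform(0.5, 1.5) for _ in range(dim)]

mask = dropout_mask(dim, keep)
full = influence(g_test, g_train, h_diag)
approx = influence(
    compress(g_test, mask),
    compress(g_train, mask),
    compress(h_diag, mask),
    scale=dim / keep,  # rescale so the subsampled sum estimates the full sum
)
print(f"full={full:.3f}  compressed={approx:.3f}  stored coords={keep}/{dim}")
```

Under these assumptions only `keep` coordinates per gradient need to be stored, and selecting them is a plain index lookup rather than a dense projection, which is where the memory and compute savings in this sketch come from.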

Yuchen Zhang, Mohammad Mohammadi Amiri • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Language Modeling | OpenWebText (test) | Average Perplexity | 25.51 | 31 |
| Language Modeling | CNN/Daily Mail (test) | Perplexity | 16.84 | 28 |
| Mislabeled Data Detection | GLUE | MRPC Score | 84.2 | 13 |
| Mislabeled Data Detection | GLUE (test val) | MRPC Score | 3.93 | 11 |
| Language Modeling | Six-source heterogeneous dataset (test) | Perplexity | 2.55 | 8 |
| Image Classification | CIFAR-10 (test) | Accuracy (5% removed) | 76.88 | 8 |
| Model Retraining | ResNet-9 (train) | Average Retraining Time (s) | 8.44 | 3 |
| Influential training example identification | Pythia 6.9B | Top-1 Accuracy | 83.1 | 2 |
| Model Retraining | GPT-2 (train) | Average Retraining Time (s) | 9.76 | 2 |