Deep Learning on a Data Diet: Finding Important Examples Early in Training

About

Recent success in deep learning has partially been driven by training increasingly overparametrized networks on ever larger datasets. It is therefore natural to ask: how much of the data is superfluous, which examples are important for generalization, and how do we find them? In this work, we make the striking observation that, in standard vision datasets, simple scores averaged over several weight initializations can be used to identify important examples very early in training. We propose two such scores -- the Gradient Normed (GraNd) and the Error L2-Norm (EL2N) scores -- and demonstrate their efficacy on a range of architectures and datasets by pruning significant fractions of training data without sacrificing test accuracy. In fact, using EL2N scores calculated a few epochs into training, we can prune half of the CIFAR10 training set while slightly improving test accuracy. Furthermore, for a given dataset, EL2N scores from one architecture or hyperparameter configuration generalize to other configurations. Compared to recent work that prunes data by discarding examples that are rarely forgotten over the course of training, our scores use only local information early in training. We also use our scores to detect noisy examples and study training dynamics through the lens of important examples -- we investigate how the data distribution shapes the loss surface and identify subspaces of the model's data representation that are relatively stable over training.
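As a rough illustration of the scoring-and-pruning idea described above, the following is a minimal sketch rather than the authors' implementation. It assumes the EL2N score of an example is the L2 norm of the error vector (softmax output minus one-hot label), averaged over several independently initialized networks, and that pruning keeps the highest-scoring examples. The function names `el2n_scores` and `keep_highest` and the array layout are hypothetical, chosen only for this example.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def el2n_scores(logits_per_init, labels):
    """Assumed EL2N: mean over initializations of ||softmax(logits) - one_hot(label)||_2.

    logits_per_init: array of shape (n_inits, n_examples, n_classes), the
        logits of each early-training network on every training example.
    labels: integer array of shape (n_examples,).
    Returns an array of shape (n_examples,).
    """
    logits_per_init = np.asarray(logits_per_init, dtype=float)
    labels = np.asarray(labels)
    n_classes = logits_per_init.shape[-1]
    one_hot = np.eye(n_classes)[labels]            # (n_examples, n_classes)
    errors = softmax(logits_per_init) - one_hot    # broadcasts over n_inits
    return np.linalg.norm(errors, axis=-1).mean(axis=0)


def keep_highest(scores, keep_fraction):
    """Return sorted indices of the top `keep_fraction` of examples by score."""
    n_keep = int(round(keep_fraction * len(scores)))
    order = np.argsort(scores)[::-1]               # highest score first
    return np.sort(order[:n_keep])
```

In use, the logits would come from several networks trained for a few epochs from different random seeds; calling `keep_highest(el2n_scores(logits, labels), 0.5)` would then select the half of the training set to retain under these assumptions.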

Mansheej Paul, Surya Ganguli, Gintare Karolina Dziugaite • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (test) | Accuracy 76.89 | 3518 |
| Image Classification | CIFAR-10 (test) | Accuracy 95.43 | 3381 |
| Object Hallucination Evaluation | POPE | Accuracy 79.5 | 1455 |
| Visual Question Answering | VQA v2 | Accuracy 76.1 | 1362 |
| Visual Question Answering | TextVQA | Accuracy 50.2 | 1285 |
| Graph Classification | MUTAG | Accuracy 88.2 | 862 |
| Text-based Visual Question Answering | TextVQA | Accuracy 50.2 | 807 |
| Image Classification | CIFAR-100 | Accuracy 74.6 | 691 |
| Multimodal Evaluation | MME | -- | 658 |
| Image Classification | CIFAR-10 | Accuracy 70.21 | 564 |
Showing 10 of 95 rows