
Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt

About

Training on web-scale data can take months. But most computation and time is wasted on redundant and noisy points that are already learnt or not learnable. To accelerate training, we introduce Reducible Holdout Loss Selection (RHO-LOSS), a simple but principled technique which selects approximately those points for training that most reduce the model's generalization loss. As a result, RHO-LOSS mitigates the weaknesses of existing data selection methods: techniques from the optimization literature typically select 'hard' (e.g. high loss) points, but such points are often noisy (not learnable) or less task-relevant. Conversely, curriculum learning prioritizes 'easy' points, but such points need not be trained on once learned. In contrast, RHO-LOSS selects points that are learnable, worth learning, and not yet learnt. RHO-LOSS trains in far fewer steps than prior art, improves accuracy, and speeds up training on a wide range of datasets, hyperparameters, and architectures (MLPs, CNNs, and BERT). On the large web-scraped image dataset Clothing-1M, RHO-LOSS trains in 18x fewer steps and reaches 2% higher final accuracy than uniform data shuffling.
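The selection rule can be illustrated with a small sketch. The abstract does not spell out the scoring formula, so the version below is an assumption drawn from the method's name: each candidate point is scored by its reducible holdout loss, taken here as the current training loss minus the loss of a separate model trained on holdout data, and the highest-scoring points are kept. The function names and the toy loss values are hypothetical.

```python
from typing import List, Sequence


def reducible_holdout_loss(
    train_loss: Sequence[float],
    irreducible_loss: Sequence[float],
) -> List[float]:
    """Per-point score: current training loss minus the holdout-model loss.

    Assumption: this difference is the "reducible holdout loss" that the
    method's name refers to; the abstract does not give the formula.
    """
    if len(train_loss) != len(irreducible_loss):
        raise ValueError("loss sequences must have the same length")
    return [t - i for t, i in zip(train_loss, irreducible_loss)]


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring points, highest first."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Hypothetical candidate batch of four points (toy values, not from the paper).
train_loss = [2.5, 2.4, 0.1, 1.5]        # loss under the model being trained
irreducible_loss = [2.3, 0.4, 0.1, 1.0]  # loss under a model trained on holdout data

scores = reducible_holdout_loss(train_loss, irreducible_loss)
selected = select_top_k(scores, k=2)
print(selected)  # [1, 3]
```

In this toy batch, point 0 has a high loss under both models (likely noisy, so not learnable) and point 2 has a low loss under both (already learnt), so neither is selected. Points 1 and 3 have a high current loss that the holdout model shows can be reduced, which matches the "learnable, worth learning, and not yet learnt" description.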

Sören Mindermann, Jan Brauner, Muhammed Razzak, Mrinank Sharma, Andreas Kirsch, Winnie Xu, Benedikt Höltgen, Aidan N. Gomez, Adrien Morisot, Sebastian Farquhar, Yarin Gal • 2022

Related benchmarks

Task                    Dataset            Metric            Result  Rank
Image Classification    CIFAR-10 (test)    Accuracy          94.1    3381
Commonsense Reasoning   HellaSwag          --                --      1891
Mathematical Reasoning  GSM8K (test)       Accuracy          70.7    770
Image Classification    CIFAR-100          Accuracy          79.63   691
Image Classification    Stanford Cars      Accuracy          80.48   635
Mathematical Reasoning  MATH (test)        Overall Accuracy  38.8    433
Image Classification    Aircraft           Accuracy          80.34   333
Reading Comprehension   BoolQ              Accuracy          83.29   279
Image Classification    Oxford-IIIT Pet    Accuracy          92.35   219
Image Classification    ImageNet-1K        Accuracy          75.13   193
Showing 10 of 28 rows
