
Study of Training Dynamics for Memory-Constrained Fine-Tuning

About

Memory-efficient training of deep neural networks has become increasingly important as models grow larger while deployment environments impose strict resource constraints. We propose TraDy, a novel transfer learning scheme leveraging two key insights: layer importance for updates is architecture-dependent and can be determined a priori, and dynamic stochastic channel selection provides a better gradient approximation than static approaches. We introduce a dynamic channel selection approach that stochastically resamples channels between epochs within preselected layers. Extensive experiments demonstrate that TraDy achieves state-of-the-art performance across various downstream tasks and architectures while maintaining strict memory constraints, achieving up to 99% activation sparsity, 95% weight derivative sparsity, and a 97% reduction in FLOPs for weight derivative computation.
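To make the channel-selection idea concrete, the sketch below shows what per-epoch stochastic channel resampling could look like in a PyTorch-style fine-tuning loop. This is not the authors' implementation: the preselected layer set, the `keep_ratio` hyperparameter, and helper names such as `resample_channel_masks` are illustrative assumptions, and masking gradients after `backward()` only approximates the memory savings TraDy obtains by skipping the unselected derivative computations altogether.

```python
# Minimal sketch of dynamic stochastic channel selection for
# memory-constrained fine-tuning, in the spirit of the abstract above.
# Assumptions (not from the paper's code): PyTorch, a list of
# `preselected_layers` chosen a priori, and a `keep_ratio` hyperparameter.
import torch
import torch.nn as nn

def resample_channel_masks(layers, keep_ratio):
    """For each preselected layer, draw a fresh random subset of output
    channels whose weights will be updated during the coming epoch."""
    masks = {}
    for layer in layers:
        n_out = layer.weight.shape[0]
        k = max(1, int(keep_ratio * n_out))
        idx = torch.randperm(n_out)[:k]
        mask = torch.zeros(n_out, dtype=torch.bool)
        mask[idx] = True
        masks[layer] = mask
    return masks

def apply_gradient_masks(masks):
    """Zero out weight derivatives of non-selected channels after backward().
    A real memory-efficient implementation would avoid computing and storing
    these derivatives at all; masking is only a functional approximation."""
    for layer, mask in masks.items():
        if layer.weight.grad is not None:
            layer.weight.grad[~mask] = 0.0
        if layer.bias is not None and layer.bias.grad is not None:
            layer.bias.grad[~mask] = 0.0

# Illustrative usage: resample once per epoch, then train as usual.
# for epoch in range(num_epochs):
#     masks = resample_channel_masks(preselected_layers, keep_ratio=0.05)
#     for x, y in loader:
#         optimizer.zero_grad()
#         loss = criterion(model(x), y)
#         loss.backward()
#         apply_gradient_masks(masks)
#         optimizer.step()
```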

Aël Quélennec, Nour Hezbri, Pavlo Mozharovskyi, Van-Tam Nguyen, Enzo Tartaglione • 2025

Related benchmarks

Task | Dataset | Result | Rank
Image Classification | CIFAR-100 (test) | - | 3518
Image Classification | CIFAR-10 (test) | - | 3381
Image Classification | Flowers (test) | Accuracy: 90 | 87
Image Classification | Food (test) | Accuracy: 84.76 | 50
Image Classification | Pets (test) | - | 36
Image Classification | CUB (test) | Top-1 Accuracy: 75.89 | 31
Natural Language Understanding | GLUE (test) | QNLI Score: 91.36 | 26
Visual Wake Words Classification | Visual Wake Words (test) | Accuracy: 93.83 | 21
Image Classification | Average (CIFAR, CUB, Flowers, Food, Pets, VWW) | Top-1 Accuracy: 88.48 | 13
Image Classification | VWW (test) | Top-1 Accuracy: 88.76 | 2
