Dataset Distillation with Convexified Implicit Gradients
About
We propose a new dataset distillation algorithm using reparameterization and convexification of implicit gradients (RCIG), which substantially improves on the state of the art. To this end, we first formulate dataset distillation as a bi-level optimization problem. Then, we show how implicit gradients can be effectively used to compute meta-gradient updates. We further equip the algorithm with a convexified approximation that corresponds to learning on top of a frozen finite-width neural tangent kernel. Finally, we reduce the bias in implicit gradients by parameterizing the neural network so that the final-layer parameters can be computed analytically given the body parameters. RCIG establishes a new state of the art on a diverse series of dataset distillation tasks. Notably, with one image per class on resized ImageNet, RCIG sees on average a 108% improvement over the previous state-of-the-art distillation algorithm. Similarly, we observe a 66% gain over the previous state of the art on Tiny-ImageNet and 37% on CIFAR-100.
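To make the bi-level formulation and the role of implicit gradients concrete, here is a minimal toy sketch. It is not the RCIG implementation. It assumes a hypothetical scalar ridge-regression model with a single synthetic point whose label `psi` is the only distilled quantity; all names and constants (`LAM`, `X_S`, `X_T`, `Y_T`) are illustrative. Because the inner problem is convex with a closed-form solution, the implicit-function-theorem gradient can be checked against a finite difference.

```python
# Toy scalar illustration of an implicit (hyper)gradient; NOT the RCIG implementation.
# Hypothetical setup: one synthetic point (X_S, psi) with a learnable label psi,
# one target point (X_T, Y_T), and a ridge-regularised scalar model theta * x.

LAM = 0.1            # ridge strength (assumed value)
X_S = 2.0            # synthetic input (kept fixed)
X_T, Y_T = 1.5, 3.0  # target example


def inner_solution(psi):
    """argmin_theta 0.5*(theta*X_S - psi)**2 + 0.5*LAM*theta**2 (closed form)."""
    return X_S * psi / (X_S ** 2 + LAM)


def outer_loss(theta):
    """Loss of the trained model on the target example."""
    return 0.5 * (theta * X_T - Y_T) ** 2


def implicit_gradient(psi):
    """d outer_loss / d psi via the implicit function theorem."""
    theta = inner_solution(psi)
    hessian = X_S ** 2 + LAM              # d^2 L_inner / d theta^2
    cross = -X_S                          # d^2 L_inner / (d theta d psi)
    g_theta = (theta * X_T - Y_T) * X_T   # d L_outer / d theta
    return -cross / hessian * g_theta


def finite_difference(psi, eps=1e-6):
    """Central-difference check obtained by re-solving the inner problem."""
    hi = outer_loss(inner_solution(psi + eps))
    lo = outer_loss(inner_solution(psi - eps))
    return (hi - lo) / (2 * eps)


def distill(psi=0.0, lr=1.0, steps=200):
    """Gradient descent on the synthetic label using the implicit gradient."""
    for _ in range(steps):
        psi -= lr * implicit_gradient(psi)
    return psi
```

Calling `distill()` drives the synthetic label toward the value for which the model trained on the single synthetic point fits the target example. In a neural-network setting the scalar `hessian` would be replaced by a linear solve against the inner-loss Hessian, which is where a convex inner problem helps.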
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet I-Squawk (test) | Accuracy | 49.9 | 71 |
| Image Classification | ImageNet-A (val) | Accuracy | 25.4 | 64 |
| Image Classification | ImageNet-Woof (test) | Accuracy | 32.9 | 46 |
| Image Classification | ImageNet I-Fruit (test) | Accuracy | 35.3 | 23 |
| Image Classification | ImageNet I-Yellow (test) | Accuracy | 53.8 | 22 |
| Image Classification | ImageNet B 2012 (val) | Estimated Accuracy | 21.3 | 17 |
| Image Classification | ImageNet Meow (test) | Accuracy | 37.1 | 16 |
| Image Classification | ImageNet-Nette (test) | Accuracy | 40 | 16 |
| Image Classification | ImageNet Subset E (val) | Accuracy | 17.1 | 9 |
| Image Classification | ImageNet Subset C (val) | Accuracy | 21.2 | 9 |