Generalizing Dataset Distillation via Deep Generative Prior

About

Dataset Distillation aims to distill an entire dataset's knowledge into a few synthetic images. The idea is to synthesize a small number of synthetic data points that, when given to a learning algorithm as training data, result in a model approximating one trained on the original data. Despite recent progress in the field, existing dataset distillation methods fail to generalize to new architectures and scale to high-resolution datasets. To overcome the above issues, we propose to use the learned prior from pre-trained deep generative models to synthesize the distilled data. To achieve this, we present a new optimization algorithm that distills a large number of images into a few intermediate feature vectors in the generative model's latent space. Our method augments existing techniques, significantly improving cross-architecture generalization in all settings.
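The core idea above, optimizing a few latent vectors through a frozen generator rather than optimizing pixels directly, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration. The generator is a fixed random map `tanh(W z)` standing in for a real pre-trained model. The loss is a simple mean-matching objective swapped in for brevity, not the objective used by the authors. The names `make_toy_generator`, `generate`, and `distill_latents` are hypothetical.

```python
import math
import random


def make_toy_generator(image_dim, latent_dim, rng):
    """Hypothetical frozen 'generator' weights (a random linear map)."""
    scale = 1.0 / math.sqrt(latent_dim)
    return [[rng.gauss(0, 1) * scale for _ in range(latent_dim)]
            for _ in range(image_dim)]


def generate(W, z):
    """Toy stand-in for a pre-trained generator: x = tanh(W z)."""
    return [math.tanh(sum(w * zj for w, zj in zip(row, z))) for row in W]


def mean_vec(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def loss_and_grads(W, latents, target_mean):
    """Squared distance between the mean synthetic image and the real mean,
    with its gradient with respect to each latent vector."""
    images = [generate(W, z) for z in latents]
    diff = [m - t for m, t in zip(mean_vec(images), target_mean)]
    loss = sum(d * d for d in diff)
    n = len(latents)
    grads = []
    for x in images:
        # Backprop through tanh: d tanh(u)/du = 1 - tanh(u)^2.
        back = [2.0 / n * d * (1.0 - xi * xi) for d, xi in zip(diff, x)]
        # Backprop through the linear map: grad_z = W^T back.
        grads.append([sum(W[i][j] * back[i] for i in range(len(W)))
                      for j in range(len(W[0]))])
    return loss, grads


def distill_latents(W, real_images, n_syn=2, steps=300, lr=0.05, seed=0):
    """Gradient descent on the latents only; the generator W stays frozen."""
    rng = random.Random(seed)
    target = mean_vec(real_images)
    latents = [[rng.gauss(0, 1) for _ in range(len(W[0]))]
               for _ in range(n_syn)]
    history = []
    for _ in range(steps):
        loss, grads = loss_and_grads(W, latents, target)
        history.append(loss)
        latents = [[zj - lr * gj for zj, gj in zip(z, g)]
                   for z, g in zip(latents, grads)]
    return latents, history


# Usage: 50 toy "real" images from one class, distilled into 2 latent vectors.
rng = random.Random(1)
W = make_toy_generator(image_dim=8, latent_dim=4, rng=rng)
real_images = [generate(W, [1.0 + 0.3 * rng.gauss(0, 1) for _ in range(4)])
               for _ in range(50)]
latents, history = distill_latents(W, real_images)
print(f"loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

Because only the latents receive updates, every synthetic sample stays on the generator's output manifold, which is the sense in which the generator acts as a prior. A realistic setup would swap in an actual pre-trained generative model and an existing distillation objective.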

George Cazenavette, Tongzhou Wang, Antonio Torralba, Alexei A. Efros, Jun-Yan Zhu • 2023

Related benchmarks

Task	Dataset	Metric	Value	
Image Classification	ImageWoof (test)	Accuracy	33.8	254
Image Classification	ImageNet I-Squawk (test)	Accuracy	23.2	71
Image Classification	ImageNet-A (val)	Accuracy	39.3	64
Image Classification	ImageNet-Woof (test)	Accuracy	25.6	46
Image Classification	ImageNet I-Woof (test)	Accuracy	32.9	36
Image Classification	PathMNIST v1 (test)	Accuracy	46.68	36
Image Classification	ImageWoof 256x256 (test)	Accuracy	33.8	26
Image Classification	ImageNet 128x128 (test)	Nette Accuracy	38.7	26
Image Classification	ImageNet I-Fruit (test)	Accuracy	21.4	23
Image Classification	ImageWoof full-sized (test)	Top-1 Accuracy	33.8	23

Showing 10 of 43 rows
