Distilling Dataset into Neural Field

About

Utilizing a large-scale dataset is essential for training high-performance deep learning models, but it also comes with substantial computation and storage costs. To overcome these challenges, dataset distillation has emerged as a promising solution by compressing the large-scale dataset into a smaller synthetic dataset that retains the essential information needed for training. This paper proposes a novel parameterization framework for dataset distillation, coined Distilling Dataset into Neural Field (DDiF), which leverages the neural field to store the necessary information of the large-scale dataset. Because a neural field takes coordinates as input and outputs the corresponding quantity, DDiF effectively preserves the information and easily generates data of various shapes. We theoretically confirm that DDiF exhibits greater expressiveness than some previous methods when the budget used for a single synthetic instance is the same. Through extensive experiments, we demonstrate that DDiF achieves superior performance on several benchmark datasets, extending beyond the image domain to include video, audio, and 3D voxel data. We release the code at https://github.com/aailab-kaist/DDiF.
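To make the coordinate-in, quantity-out idea concrete, the following is a minimal sketch and not the DDiF implementation. It assumes a tiny two-layer coordinate network with sine activations; the layer sizes, the frequency `omega`, and the helper names `init_field`, `field`, `decode_image`, and `num_params` are all hypothetical. It shows how one fixed set of parameters can be decoded on grids of different resolution.

```python
import math
import random

def init_field(in_dim=2, hidden=16, out_dim=1, seed=0):
    """Randomly initialise a tiny coordinate MLP (sizes are illustrative assumptions)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    w2 = [[rng.uniform(-1, 1) / hidden for _ in range(hidden)] for _ in range(out_dim)]
    b2 = [0.0] * out_dim
    return w1, b1, w2, b2

def field(params, coord, omega=10.0):
    """Map one coordinate tuple to an output quantity (e.g. a pixel intensity)."""
    w1, b1, w2, b2 = params
    # Hidden layer with a sine activation (an assumed, illustrative choice).
    h = [math.sin(omega * (sum(w * c for w, c in zip(row, coord)) + b))
         for row, b in zip(w1, b1)]
    # Linear output layer.
    return [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(w2, b2)]

def decode_image(params, height, width):
    """Query the field on a height x width grid of coordinates in [-1, 1]."""
    def axis(n):
        return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return [[field(params, (y, x))[0] for x in axis(width)] for y in axis(height)]

def num_params(params):
    """Count stored scalars: the storage budget of one synthetic instance."""
    w1, b1, w2, b2 = params
    return sum(len(r) for r in w1) + len(b1) + sum(len(r) for r in w2) + len(b2)

params = init_field()
small = decode_image(params, 8, 8)     # same parameters ...
large = decode_image(params, 16, 16)   # ... decoded at a finer resolution
print(num_params(params), len(small), len(large))
```

In this sketch the stored object is the parameter set rather than a pixel array, so the storage cost stays fixed while the decoding grid changes. In a distillation setting the parameters would be optimized rather than left random; that step is omitted here.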

Donghyeok Shin, HeeSun Bae, Gyuwon Sim, Wanmo Kang, Il-Chul Moon • 2025

Related benchmarks

Task	Dataset	Result
Image Classification	CIFAR-100	Accuracy 49.9	204
Image Classification	ImageNet I-Squawk (test)	Accuracy 67	71
Image Classification	ImageMeow	Accuracy 40.3	63
Image Classification	ImageYellow	Accuracy 56.2	63
Image Classification	ImageFruit	Accuracy 42	56
Image Classification	ImageSquawk	Accuracy 62.6	52
Image Classification	ImageNet I-Woof (test)	Accuracy 39.6	36
Image Classification	ImageNet 128x128 (test)	Nette Accuracy 72	26
Image Classification	ImageNet I-Fruit (test)	Accuracy 43.2	23
Image Classification	ImageNet I-Yellow (test)	Accuracy 63.1	22

Showing 10 of 16 rows
