Dataset Distillation with Neural Characteristic Function: A Minmax Perspective

About

Dataset distillation has emerged as a powerful approach for reducing data requirements in deep learning. Among various methods, distribution matching-based approaches stand out for their balance of computational efficiency and strong performance. However, existing distance metrics used in distribution matching often fail to accurately capture distributional differences, leading to unreliable measures of discrepancy. In this paper, we reformulate dataset distillation as a minmax optimization problem and introduce Neural Characteristic Function Discrepancy (NCFD), a comprehensive and theoretically grounded metric for measuring distributional differences. NCFD leverages the Characteristic Function (CF) to encapsulate full distributional information, employing a neural network to optimize the sampling strategy for the CF's frequency arguments, thereby maximizing the discrepancy to enhance distance estimation. Simultaneously, we minimize the difference between real and synthetic data under this optimized NCFD measure. Our approach, termed Neural Characteristic Function Matching (NCFM), inherently aligns the phase and amplitude of neural features in the complex plane for both real and synthetic data, achieving a balance between realism and diversity in synthetic samples. Experiments demonstrate that our method achieves significant performance gains over state-of-the-art methods on both low- and high-resolution datasets. Notably, we achieve a 20.5% accuracy boost on ImageSquawk. Our method also reduces GPU memory usage by over 300× and achieves 20× faster processing speeds compared to state-of-the-art methods. To the best of our knowledge, this is the first work to achieve lossless compression of CIFAR-100 on a single NVIDIA 2080 Ti GPU using only 2.3 GB of memory.
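
The idea of comparing two sample sets through their characteristic functions can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `empirical_cf` and `cf_discrepancy` are hypothetical, the frequency vectors are drawn from a fixed standard Gaussian instead of being produced by a learned sampling network, and the inputs are raw toy samples instead of neural features. It only shows that the mean squared modulus of the CF difference, which depends on both amplitude and phase, is small for samples from the same distribution and larger for a shifted one.

```python
import numpy as np


def empirical_cf(x, t):
    """Empirical characteristic function of samples x at frequencies t.

    x: (n, d) array of samples; t: (m, d) array of frequency vectors.
    Returns a complex array of shape (m,), the sample mean of exp(i <t, x>).
    """
    proj = x @ t.T  # (n, m) inner products <t_j, x_i>
    return np.exp(1j * proj).mean(axis=0)


def cf_discrepancy(x, y, t):
    """Mean squared modulus of the CF difference over the frequencies t."""
    diff = empirical_cf(x, t) - empirical_cf(y, t)
    return float(np.mean(np.abs(diff) ** 2))


# Toy data (assumption): Gaussian samples standing in for real/synthetic features.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(2000, 4))
same = rng.normal(0.0, 1.0, size=(2000, 4))      # same distribution as `real`
shifted = rng.normal(1.5, 1.0, size=(2000, 4))   # mean-shifted distribution

# Fixed Gaussian frequencies (assumption); the paper learns this sampling instead.
freqs = rng.normal(0.0, 1.0, size=(256, 4))

d_same = cf_discrepancy(real, same, freqs)
d_shifted = cf_discrepancy(real, shifted, freqs)
print(f"same distribution:    {d_same:.4f}")
print(f"shifted distribution: {d_shifted:.4f}")
```

In the minmax formulation described above, the frequency distribution would be adjusted to maximize such a discrepancy while the synthetic data are updated to minimize it; the sketch keeps the frequencies fixed and performs no optimization.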

Shaobo Wang, Yicun Yang, Zhiyuan Liu, Chenghao Sun, Xuming Hu, Conghui He, Linfeng Zhang • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | CIFAR-100 (test) | Accuracy 54.7 | 3518 |
| Image Classification | CIFAR-10 (test) | Accuracy 77.4 | 3381 |
| Image Classification | MNIST (test) | Accuracy 96.9 | 894 |
| Image Classification | CIFAR-100 | Accuracy 54.7 | 691 |
| Image Classification | CIFAR-100 | Accuracy 44.8 | 302 |
| Classification | CIFAR10 (test) | Accuracy 77.4 | 293 |
| Image Classification | CIFAR-10 | Accuracy 77.4 | 246 |
| Image Classification | CIFAR-10-LT | Accuracy 75.6 | 146 |
| Image Classification | CIFAR-100 LT | Top-1 Acc 48 | 131 |
| Classification | CIFAR-100 (test) | Accuracy 54.7 | 129 |
Showing 10 of 22 rows
