Is synthetic data from generative models ready for image recognition?
About
Recent text-to-image generation models have shown promising results in generating high-fidelity, photo-realistic images. Though the results are astonishing to human eyes, how applicable these generated images are to recognition tasks remains under-explored. In this work, we extensively study whether and how synthetic images generated by state-of-the-art text-to-image generation models can be used for image recognition, focusing on two perspectives: synthetic data for improving classification models in data-scarce settings (i.e., zero-shot and few-shot), and synthetic data for large-scale model pre-training for transfer learning. We showcase the strengths and shortcomings of synthetic data from existing generative models, and propose strategies for applying synthetic data to recognition tasks more effectively. Code: https://github.com/CVMI-Lab/SyntheticData.
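The workflow the abstract describes, generating labeled training images from class names via a text-to-image model, can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: the `generate` callback stands in for a real text-to-image model (e.g. a diffusion pipeline), and the prompt template is an assumed example.

```python
def build_prompts(class_names, template="a photo of a {}", per_class=2):
    """Expand each class name into `per_class` text prompts.

    Each (label, prompt) pair can be fed to a text-to-image model to
    produce one synthetic training image; the label comes for free
    from the class name, so no manual annotation is needed.
    """
    return [(name, template.format(name))
            for name in class_names
            for _ in range(per_class)]

def synthesize_dataset(class_names, generate, per_class=2):
    """Turn prompts into a labeled synthetic dataset.

    `generate` is a placeholder for a real text-to-image model call
    (hypothetical signature: prompt -> image).
    """
    return [(label, generate(prompt))
            for label, prompt in build_prompts(class_names, per_class=per_class)]

# Example with a dummy generator standing in for a real model.
dataset = synthesize_dataset(["goldfish", "tabby cat"],
                             generate=lambda prompt: f"<image for: {prompt}>")
for label, image in dataset:
    print(label, image)
```

In the data-scarce settings studied in the paper, such synthetic images would then be used to train or augment a classifier; for the pre-training perspective, generation is scaled up across many classes before transferring to downstream tasks.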
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet-1K | Top-1 Acc | 39.07 | 1239 |
| Image Classification | CIFAR-100 | -- | -- | 435 |
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy | 92.9 | 348 |
| Image Classification | Stanford Cars (test) | Accuracy | 94.66 | 316 |
| Image Classification | FGVC-Aircraft (test) | Accuracy | 89.07 | 305 |
| Image Classification | CUB-200-2011 (test) | Top-1 Acc | 89.54 | 286 |
| Image Classification | DomainNet | Accuracy (ClipArt) | 22.38 | 206 |
| Image Classification | Stanford Dogs (test) | Top-1 Acc | 83.5 | 113 |
| Class-incremental learning | ImageNet-R | Average Accuracy | 6.25 | 112 |
| Image Classification | DTD | -- | -- | 96 |