Is synthetic data from generative models ready for image recognition?
About
Recent text-to-image generation models have shown promising results in generating high-fidelity, photo-realistic images. Though the results are astonishing to human eyes, how applicable these generated images are to recognition tasks remains under-explored. In this work, we extensively study whether and how synthetic images generated by state-of-the-art text-to-image generation models can be used for image recognition tasks, focusing on two perspectives: synthetic data for improving classification models in data-scarce settings (i.e., zero-shot and few-shot), and synthetic data for large-scale model pre-training for transfer learning. We showcase the strengths and shortcomings of synthetic data from existing generative models, and propose strategies for applying synthetic data more effectively to recognition tasks. Code: https://github.com/CVMI-Lab/SyntheticData.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | ImageNet-1K | Top-1 Acc: 39.07 | 1239 |
| Image Classification | CIFAR-100 | -- | 435 |
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy: 92.9 | 372 |
| Image Classification | FGVC-Aircraft (test) | Accuracy: 89.07 | 322 |
| Image Classification | Stanford Cars (test) | Accuracy: 94.66 | 320 |
| Image Classification | CUB-200-2011 (test) | Top-1 Acc: 89.54 | 303 |
| Image Classification | DomainNet | Accuracy (ClipArt): 22.38 | 238 |
| Image Classification | ImageNet-100 | -- | 163 |
| Class-incremental learning | ImageNet-R | Last Accuracy: 3.53 | 147 |
| Image Classification | Stanford Dogs (test) | Top-1 Acc: 83.5 | 140 |