Is synthetic data from generative models ready for image recognition?
About
Recent text-to-image generation models have shown promising results in generating high-fidelity, photo-realistic images. Though the results are astonishing to human eyes, how applicable these generated images are to recognition tasks remains under-explored. In this work, we extensively study whether and how synthetic images generated by state-of-the-art text-to-image generation models can be used for image recognition tasks, focusing on two perspectives: synthetic data for improving classification models in data-scarce settings (i.e., zero-shot and few-shot), and synthetic data for large-scale model pre-training for transfer learning. We showcase the strengths and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data to recognition tasks. Code: https://github.com/CVMI-Lab/SyntheticData.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy 92.9 | 348 |
| Image Classification | Stanford Cars (test) | Accuracy 94.66 | 306 |
| Image Classification | CUB-200-2011 (test) | Top-1 Acc 89.54 | 276 |
| Image Classification | FGVC-Aircraft (test) | Accuracy 89.07 | 231 |
| Image Classification | Stanford Dogs (test) | Top-1 Acc 83.5 | 85 |
| Image Classification | DTD | -- | 79 |
| Fine-grained visual classification | CUB-200-2011 (test) | Top-1 Acc 0.828 | 70 |
| Image Classification | Oxford-IIIT Pet (test) | Overall Accuracy 92.9 | 59 |
| Image Classification | AIR | Accuracy 39.9 | 22 |
| Fine grained classification | Aircraft (test) | Accuracy 84.8 | 18 |