UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation
About
Unpaired image-to-image translation has broad applications in art, design, and scientific simulation. One early breakthrough was CycleGAN, which emphasizes one-to-one mappings between two unpaired image domains via generative adversarial networks (GANs) coupled with a cycle-consistency constraint, while more recent works promote one-to-many mappings to boost the diversity of the translated images. Motivated by scientific simulation and the need for one-to-one translation, this work revisits the classic CycleGAN framework and boosts its performance so that it outperforms more contemporary models without relaxing the cycle-consistency constraint. To achieve this, we equip the generator with a Vision Transformer (ViT) and employ necessary training and regularization techniques. Compared to previous best-performing models, our model performs better and retains a strong correlation between the original and translated images. An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement. To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained models are available at https://github.com/LS4GAN/uvcgan.
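The cycle-consistency constraint mentioned above requires that translating an image to the other domain and back reconstructs the original. A minimal NumPy sketch of that loss term is shown below; the function and argument names are illustrative, not the repository's actual API, and real training would compute this with an autodiff framework alongside the adversarial and gradient-penalty terms.

```python
import numpy as np

def cycle_consistency_loss(g_ab, g_ba, real_a, real_b, lam=10.0):
    """L1 cycle loss: a -> b -> a and b -> a -> b should reconstruct the inputs.

    g_ab, g_ba : callables mapping an image array from one domain to the other
    real_a, real_b : image batches from domains A and B
    lam : weight of the cycle term relative to the adversarial loss (illustrative)
    """
    # Round-trip each batch through both generators.
    rec_a = g_ba(g_ab(real_a))
    rec_b = g_ab(g_ba(real_b))
    # Penalize the mean absolute reconstruction error in both directions.
    return lam * (np.abs(rec_a - real_a).mean() + np.abs(rec_b - real_b).mean())
```

With identity generators the loss is exactly zero, which is why the constraint encourages one-to-one mappings: any information destroyed by the forward translation cannot be recovered on the return trip and is penalized.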
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image-to-Image Translation | CD3 (test) | PSNR 19.47 | 28 |
| Virtual Staining | IHC (CK8/18) (test) | PSNR 19.3 | 27 |
| Virtual Staining | HEMIT 13 (full dataset) | PSNR 23.26 | 24 |
| RAW-to-RAW mapping | MDRAW Samsung S9 -> iPhone-X (test) | PSNR 27.22 | 12 |
| RAW-to-RAW mapping | MDRAW iPhone-X -> Samsung S9 (test) | PSNR 26.1 | 12 |
| RAW2RAW Translation | MDRAW | Avg MAE 0.038 | 9 |
| RGB-to-NIR translation | RANUS (test) | PSNR 18.21 | 8 |
| Image-to-Image Translation | IDD-AW (test) | PSNR 27.63 | 7 |
| RAW2RAW Translation | MDRAW Huawei to Samsung | MAE 0.028 | 3 |
| RAW2RAW Translation | MDRAW Huawei to iPhone | MAE 3.3 | 3 |