The Spatially-Correlative Loss for Various Image Translation Tasks
About
We propose a novel spatially-correlative loss that is simple, efficient and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired image-to-image (I2I) translation. Previous methods attempt this by using pixel-level cycle-consistency or feature-level matching losses, but the domain-specific nature of these losses hinders translation across large domain gaps. To address this, we exploit the spatial patterns of self-similarity as a means of defining scene structure. Our spatially-correlative loss is geared towards only capturing spatial relationships within an image rather than domain appearance. We also introduce a new self-supervised learning method to explicitly learn spatially-correlative maps for each specific translation task. We show distinct improvement over baseline models in all three modes of unpaired I2I translation: single-modal, multi-modal, and even single-image translation. This new loss can easily be integrated into existing network architectures and thus allows wide applicability.
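To make the idea of comparing self-similarity patterns rather than appearance more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names `self_similarity` and `spatially_correlative_loss` are hypothetical, the features are random arrays standing in for the output of some feature extractor, and the choice of cosine similarity over all pairs of spatial locations is an assumption made for illustration. The learned, self-supervised variant mentioned in the abstract is not covered.

```python
import numpy as np


def self_similarity(features):
    """Cosine similarity between every pair of spatial locations.

    features: array of shape (C, H, W). Returns an (H*W, H*W) matrix.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w).T                # (N, C), one row per location
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-8)              # guard against zero vectors
    return unit @ unit.T


def spatially_correlative_loss(feat_src, feat_out):
    """Mean cosine distance between corresponding rows of two self-similarity maps.

    Illustrative assumption: row i of each map describes how location i relates
    to all other locations, so matching rows means matching spatial structure.
    """
    s_src = self_similarity(feat_src)
    s_out = self_similarity(feat_out)
    num = np.sum(s_src * s_out, axis=1)
    den = np.linalg.norm(s_src, axis=1) * np.linalg.norm(s_out, axis=1)
    cos = num / np.maximum(den, 1e-8)
    return float(np.mean(1.0 - cos))


rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 6, 6))                      # stand-in features (C=8, H=W=6)

# "Appearance" change: rotate and rescale the channel space. Every feature vector
# changes, but the similarities between locations are preserved.
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))           # random orthogonal matrix
feat_appearance = 3.0 * np.einsum("ij,jhw->ihw", q, feat)

# "Structure" change: mirror the feature map horizontally.
feat_structure = feat[:, :, ::-1]

loss_appearance = spatially_correlative_loss(feat, feat_appearance)
loss_structure = spatially_correlative_loss(feat, feat_structure)
```

In this toy setup, `loss_appearance` stays at floating-point zero even though every feature value has changed, while `loss_structure` is clearly positive because the spatial layout has moved. That is the property the abstract describes: the loss responds to spatial relationships within an image and not to domain appearance.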
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image-to-Image Translation | Horse -> Zebra | FID | 43.4 | 23 |
| Object Detection | Foggy Cityscapes to Cityscapes (test) | AP (person) | 40.9 | 21 |
| Unpaired Image-to-Image Translation | Cat → Dog v1 (test) | FID | 72.8 | 14 |
| Unpaired Image-to-Image Translation | Cityscapes | Pixel Accuracy | 73.2 | 8 |
| Unpaired Image-to-Image Translation | Horse-to-Zebra | FID | 38 | 8 |
| Artistic Style Transfer | WikiArt Cezanne | FID | 141.3 | 8 |
| Artistic Style Transfer | General Content Images | Inference Time (s) | 0.0365 | 8 |
| Artistic Style Transfer | WikiArt Van Gogh | FID | 105.2 | 8 |
| Artistic Style Transfer | WikiArt Ukiyoe | FID | 130.8 | 8 |
| Artistic Style Transfer | WikiArt Gauguin | FID | 172.2 | 8 |