Arbitrary Style Transfer via Multi-Adaptation Network
About
Arbitrary style transfer is a significant topic with research value and application prospects. Given a content image and a reference style painting, a desirable style transfer renders the content image with the color tone and vivid stroke patterns of the style painting while preserving the detailed content structure. Style transfer approaches first learn content and style representations of the content and style references and then generate stylized images guided by these representations. In this paper, we propose the multi-adaptation network, which comprises two self-adaptation (SA) modules and one co-adaptation (CA) module. The SA modules adaptively disentangle the content and style representations: the content SA module uses position-wise self-attention to enhance the content representation, and the style SA module uses channel-wise self-attention to enhance the style representation. The CA module rearranges the distribution of the style representation based on the distribution of the content representation by calculating the local similarity between the disentangled content and style features in a non-local fashion. Moreover, a new disentanglement loss function enables our network to extract the main style patterns and the exact content structures, respectively, so that it adapts to various input images. Various qualitative and quantitative experiments demonstrate that the proposed multi-adaptation network leads to better results than state-of-the-art style transfer methods.
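To make the three attention patterns named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes feature maps flattened to shape `(C, N)` (channels by spatial positions) and omits the learned projections, normalization, decoder, and disentanglement loss that a trained network would include. The function names `position_self_attention`, `channel_self_attention`, and `co_adaptation` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def position_self_attention(feat):
    """Position-wise self-attention (content SA idea, simplified).

    feat: (C, N) array. Each position attends to every other position.
    """
    attn = softmax(feat.T @ feat, axis=-1)  # (N, N) position-to-position weights
    return feat + feat @ attn.T             # residual connection, shape (C, N)


def channel_self_attention(feat):
    """Channel-wise self-attention (style SA idea, simplified).

    feat: (C, N) array. Each channel attends to every other channel.
    """
    attn = softmax(feat @ feat.T, axis=-1)  # (C, C) channel-to-channel weights
    return feat + attn @ feat               # residual connection, shape (C, N)


def co_adaptation(content, style):
    """Rearrange style features according to content/style similarity.

    content: (C, Nc), style: (C, Ns). For every content position, the output
    is a softmax-weighted average of the style features over all style
    positions (a non-local lookup).
    """
    attn = softmax(content.T @ style, axis=-1)  # (Nc, Ns)
    return style @ attn.T                       # (C, Nc)


# Toy usage with random features (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
content_feat = rng.standard_normal((8, 16))  # C=8, Nc=16
style_feat = rng.standard_normal((8, 20))    # C=8, Ns=20

enhanced_content = position_self_attention(content_feat)
enhanced_style = channel_self_attention(style_feat)
rearranged_style = co_adaptation(enhanced_content, enhanced_style)
print(rearranged_style.shape)
```

In this sketch the rearranged style features take the spatial layout of the content features, which is the property a decoder would need in order to paint style patterns onto the content structure.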
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Style Transfer | MS-COCO and WikiArt | Execution Time (s) | 0.024 | 48 |
| Image Style Transfer | User Study | Overall Quality Score | 44.8 | 30 |
| Image Style Transfer | (test) | Average Inference Time (s) | 0.03 | 22 |
| Style Transfer | MS-COCO (content) + WikiArt (style) (test) | Lcont | 4.93 | 17 |
| Artistic Style Transfer | MS-COCO content images and WikiArt style images 512x512 resolution (test) | FID (Artistic Style) | 31.282 | 13 |
| Style Transfer | Style Transfer (test) | Lc | 2.29 | 11 |
| Text-Guided Image Manipulation | Human Face images with 10 text conditions (test) | Style Score | 2.8 | 7 |
| Video Style Transfer | 20 styles (test) | Style 1 Optical Flow Error | 463 | 7 |
| Text-driven Style Transfer | Custom Stylized Images 10 text conditions (test) | CLIP Score | 0.2213 | 7 |