Handwriting Transformers
About
We propose HWT, a novel transformer-based approach to styled handwritten text image generation that aims to learn style-content entanglement together with global and local writing style patterns. HWT captures long- and short-range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, HWT includes an encoder-decoder attention that enables style-content entanglement by gathering the style representation of each query character. To the best of our knowledge, we are the first to introduce a transformer-based generative network for styled handwritten text generation. HWT generates realistic styled handwritten text images and significantly outperforms the state of the art, as demonstrated through extensive qualitative, quantitative, and human-based evaluations. It can handle text of arbitrary length and any desired writing style in a few-shot setting. Further, HWT generalizes well to the challenging scenario in which both the words and the writing style are unseen during training, still generating realistic styled handwritten text images.
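The two attention steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses plain scaled dot-product attention on Python lists and omits learned projections, multiple heads, and every other layer of the real model. The function names (`encode_style`, `decode_content`) and the toy vectors are hypothetical, chosen only to show how self-attention relates style features to one another and how encoder-decoder attention lets each query character gather a style representation.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    # Scaled dot-product attention: one output vector per query.
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
        )
    return outputs

def encode_style(style_features):
    # Self-attention: every style feature attends to all style features,
    # so near and distant positions can both contribute.
    return attention(style_features, style_features, style_features)

def decode_content(char_queries, style_memory):
    # Encoder-decoder attention: each query character gathers its own
    # mixture of the encoded style features.
    return attention(char_queries, style_memory, style_memory)

# Toy data (assumed): three style feature vectors, two character queries.
style_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
char_queries = [[0.5, 0.5], [2.0, -1.0]]

style_memory = encode_style(style_features)
styled = decode_content(char_queries, style_memory)
print(len(styled), len(styled[0]))
```

Because the number of output vectors follows the number of character queries, the same sketch works for text of any length, which mirrors the arbitrary-length claim in the abstract.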
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Handwritten Text Generation | IAM word-level | FID 27.83 | 16 |
| Handwriting generation | IAM (test) | FID 19.4 | 9 |
| Handwriting Synthesis | CVL line-level | FID 31.22 | 8 |
| Handwritten Text Generation | CVL Lines (test) | FID 31.22 | 8 |
| Line-level Text-to-Image Synthesis | Karaoke Typewritten (test) | FID 72.78 | 8 |
| Styled Text Generation | Karaoke (Typewritten) | FID 72.78 | 8 |
| Line-level Text-to-Image Synthesis | Karaoke Handwritten (test) | FID 62.69 | 8 |
| Styled Text Generation | Karaoke Calligraphy | FID 62.69 | 8 |
| Handwriting Synthesis | RIMES line-level | FID 118.2 | 8 |
| Handwritten Text Generation | RIMES Lines (test) | FID 118.2 | 8 |