BLT: Bidirectional Layout Transformer for Controllable Layout Generation
About
Creating visual layouts is a critical step in graphic design. Automatic generation of such layouts is essential for scalable and diverse visual designs. To advance conditional layout generation, we introduce BLT, a bidirectional layout transformer. BLT differs from previous layout transformers in that it is non-autoregressive. In training, BLT learns to predict masked attributes by attending to the surrounding attributes in both directions. During inference, BLT first generates a draft layout from the input and then iteratively refines it into a high-quality layout by masking out low-confidence attributes and predicting them again. The masks generated in both training and inference are controlled by a new hierarchical sampling policy. We verify the proposed model on six benchmarks covering diverse design tasks. Experimental results demonstrate two benefits over state-of-the-art layout transformer models. First, our model enables layout transformers to perform controllable layout generation. Second, it generates a layout up to 10x faster at inference time than the layout transformer baseline. Code is released at https://shawnkx.github.io/blt.
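The draft-then-refine inference loop described above can be illustrated with a minimal sketch. This is not the released BLT code: it assumes a layout flattened into a sequence of integer attribute tokens, a hypothetical `predict` callable that returns a `(value, confidence)` pair for every position, and a simple linearly decaying re-masking schedule. The hierarchical sampling policy from the paper is not reproduced here; plain lowest-confidence re-masking is used in its place. Attributes supplied by the user (the conditions) are never masked.

```python
from typing import Callable, List, Optional, Tuple

MASK = None  # placeholder for an attribute that still has to be generated

# Hypothetical model interface: one (value, confidence) pair per position.
Predictor = Callable[[List[Optional[int]]], List[Tuple[int, float]]]


def iterative_refine(
    tokens: List[Optional[int]], predict: Predictor, num_iters: int = 4
) -> List[Optional[int]]:
    """Fill every MASK in `tokens`, re-masking low-confidence guesses each round."""
    if num_iters < 1:
        raise ValueError("num_iters must be at least 1")
    current = list(tokens)
    # Only positions that were masked in the input may ever change.
    free = [i for i, t in enumerate(tokens) if t is MASK]
    confidence = {i: 0.0 for i in free}

    for step in range(num_iters):
        preds = predict(current)
        # Draft (or redraft) every position that is currently masked.
        for i in free:
            if current[i] is MASK:
                current[i], confidence[i] = preds[i]
        # Assumed schedule: the number of re-masked attributes shrinks to zero.
        n_mask = len(free) * (num_iters - 1 - step) // num_iters
        if n_mask == 0:
            break
        # Mask out the least confident free attributes for the next round.
        for i in sorted(free, key=lambda j: confidence[j])[:n_mask]:
            current[i] = MASK
    return current


def toy_predict(seq: List[Optional[int]]) -> List[Tuple[int, float]]:
    """Stand-in for a trained model; values and confidences are arbitrary."""
    known = sum(t is not MASK for t in seq)
    out = []
    for i, t in enumerate(seq):
        value = t if t is not MASK else (i * 7) % 32
        out.append((value, (known + 1) / (len(seq) + 1 + i % 3)))
    return out


# Two elements, each flattened as (category, x, y, w, h); categories are given.
layout = [3, MASK, MASK, MASK, MASK, 5, MASK, MASK, MASK, MASK]
refined = iterative_refine(layout, toy_predict, num_iters=4)
```

Because all masked positions are predicted in parallel, the number of model calls is bounded by `num_iters` rather than by the sequence length, which is the property that makes a non-autoregressive decoder faster than token-by-token generation.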
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Unconditional Layout Generation | Rico | FID | 88.2 | 55 |
| Conditional layout generation (Category to Size and Position) | PubLayNet | FID | 5.1 | 27 |
| Conditional layout generation (Category to Size and Position) | Rico | FID | 4.48 | 27 |
| Conditional Layout Generation | PubLayNet (test) | IoU | 0.19 | 12 |
| Unconditional Layout Generation | PubLayNet | FID | 116 | 7 |
| Generation from Types and Sizes (Gen-TS) | RICO (test) | mIoU | 60.4 | 3 |
| Generation from Types (Gen-T) | RICO (test) | mIoU | 21.6 | 3 |
| Generation from Types (Gen-T) | PubLayNet (test) | mIoU | 14 | 3 |
| Generation from Types and Sizes (Gen-TS) | PubLayNet (test) | mIoU | 42.8 | 3 |