ResViT: Residual vision transformers for multi-modal medical image synthesis
About
Generative adversarial models with convolutional neural network (CNN) backbones have recently been established as state-of-the-art in numerous medical image synthesis tasks. However, CNNs are designed to perform local processing with compact filters, and this inductive bias compromises learning of contextual features. Here, we propose a novel generative adversarial approach for medical image synthesis, ResViT, that leverages the contextual sensitivity of vision transformers along with the precision of convolution operators and realism of adversarial learning. ResViT's generator employs a central bottleneck comprising novel aggregated residual transformer (ART) blocks that synergistically combine residual convolutional and transformer modules. Residual connections in ART blocks promote diversity in captured representations, while a channel compression module distills task-relevant information. A weight sharing strategy is introduced among ART blocks to mitigate computational burden. A unified implementation is introduced to avoid the need to rebuild separate synthesis models for varying source-target modality configurations. Comprehensive demonstrations are performed for synthesizing missing sequences in multi-contrast MRI, and CT images from MRI. Our results indicate superiority of ResViT against competing CNN- and transformer-based methods in terms of qualitative observations and quantitative metrics.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Across-modality synthesis (T2-weighted MRI to CT) | Pelvic MRI-CT dataset (test) | PSNR | 28.45 | 42 |
| Multi-contrast MRI Synthesis (T2, PD -> T1) | IXI (test) | PSNR | 29.58 | 23 |
| Many-to-one MRI Synthesis (T1, FLAIR -> T2) | BRATS (test) | PSNR | 26.9 | 21 |
| Many-to-one MRI Synthesis (T2, FLAIR -> T1) | BRATS (test) | PSNR | 26.24 | 21 |
| MRI Synthesis (T1, T2 to FLAIR) | BraTS 2018 | PSNR | 25.84 | 20 |
| Multi-contrast MRI Synthesis (T1, PD -> T2) | IXI (test) | PSNR | 35.71 | 17 |
| Multi-contrast MRI Synthesis (T1, T2 -> PD) | IXI (test) | PSNR | 33.92 | 17 |
| Image Synthesis | IXI PD-w to T2-w (test) | PSNR (dB) | 34.24 | 14 |
| Medical Image-to-Image Translation (T1→T2) | BraTS 2023 (test) | PSNR | 25.5658 | 14 |
| Medical Image-to-Image Translation (T2→FLAIR) | BraTS 2023 (test) | PSNR | 25.0538 | 14 |
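
All results above are reported as PSNR (peak signal-to-noise ratio). As a reading aid, the standard definition, 10 · log10(MAX² / MSE), can be sketched as follows. This is a minimal illustration, not the evaluation code behind these benchmarks: the function name `psnr`, the flat-list image representation, and the default intensity range `max_value=1.0` are assumptions made here, and each benchmark may normalize or mask images differently.

```python
import math


def psnr(reference, synthesized, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are given as flat sequences of pixel intensities
    (an assumption for this sketch). `max_value` is the assumed
    maximum possible intensity, e.g. 1.0 for images scaled to [0, 1].
    """
    if len(reference) != len(synthesized):
        raise ValueError("images must contain the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthesized)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    target = [0.0, 0.5, 1.0, 0.25]   # hypothetical ground-truth pixels
    output = [0.1, 0.4, 0.9, 0.35]   # hypothetical synthesized pixels
    print(f"PSNR: {psnr(target, output):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, higher values indicate a closer match to the reference image, and values are only comparable when computed with the same intensity range and on the same test set.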