Med-URWKV†: Toward Enhanced Pretrained Pure VRWKV Models for Medical Image Segmentation
About
Medical image segmentation is a fundamental task in computer-aided diagnosis and treatment. Existing approaches based on CNNs, ViTs, Mamba, and hybrid models still suffer from limitations such as restricted receptive fields, high computational cost, or insufficient accuracy. Recently, Vision Receptive-field Weighted Key-Value (VRWKV) models have emerged as a promising alternative, delivering strong long-range dependency modeling for visual tasks. However, current studies on VRWKV-based medical image segmentation mainly focus on hybrid architectures trained from scratch, while the potential of large-scale pretrained pure VRWKV models remains unexplored. In this work, we systematically investigate the effectiveness of pure VRWKV architectures for medical image segmentation. We construct Med-URWKV-T and Med-URWKV-S by reusing pretrained VRWKV encoders at different scales and pairing them with pure VRWKV decoders, enabling a comprehensive evaluation of pretrained pure VRWKV models in this domain. To further enhance performance, we propose two VRWKV-compatible modules: a Frequency-Aware Wavelet Attention (FAWA) module, which exploits wavelet transforms to capture edge details and structural characteristics, and a Multi-Scale Channel Fusion (MSCF) module, which integrates multi-scale features to strengthen informative channel representations. By incorporating both modules into Med-URWKV-T, we obtain the enhanced model Med-URWKV†. Extensive experiments on five medical image segmentation datasets demonstrate that Med-URWKV achieves performance comparable or superior to state-of-the-art methods and carefully designed hybrid VRWKV architectures. Moreover, Med-URWKV† further improves segmentation accuracy, surpassing Med-URWKV-S with only half as many parameters, and achieves the highest average Dice similarity coefficient of 88.00%. The code will be released.
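For readers unfamiliar with the metric, the Dice similarity coefficient reported above is commonly defined as twice the overlap between the predicted and ground-truth masks divided by the sum of their sizes. The following is a minimal sketch of that standard definition, not the authors' evaluation code: the function name `dice_coefficient`, the flat 0/1 mask representation, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (an illustrative representation, not the paper's data format).
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labelled foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    # Total foreground pixels across both masks.
    total = sum(pred) + sum(target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 2 overlapping foreground pixels, 3 foreground pixels per mask.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)  # 2 * 2 / (3 + 3)
print(f"Dice = {score:.4f}")
```

Multiplying the result by 100 gives the percentage form used in the abstract.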
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Medical Image Segmentation | ISIC 2018 | Dice Score | 89.85 | 187 |
| Medical Image Segmentation | BUSI | Dice Score | 80.9 | 134 |
| Medical Image Segmentation | GLAS | Dice | 91.97 | 106 |
| Medical Image Segmentation | ISIC 2017 | Dice Score | 85.91 | 102 |
| Medical Image Segmentation | Kvasir-Seg | Dice Coefficient | 0.9141 | 59 |
| Medical Image Segmentation | BUSI, ISIC-2017, ISIC-2018, Kvasir-SEG, GLAS | DSC | 88 | 20 |
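As a quick consistency check, the combined five-dataset entry can be compared against the per-dataset rows. The sketch below assumes the combined figure is an unweighted mean of the five per-dataset Dice values, which the page does not state explicitly, and rescales the Kvasir-Seg value from a fraction to a percentage so that all entries share one unit.

```python
# Per-dataset Dice values copied from the benchmark table, in percent.
dice_percent = {
    "ISIC 2018": 89.85,
    "BUSI": 80.9,
    "GLAS": 91.97,
    "ISIC 2017": 85.91,
    "Kvasir-Seg": 0.9141 * 100,  # listed as a fraction; rescaled to percent
}

# Assumption: the combined entry is a simple unweighted mean over datasets.
mean_dice = sum(dice_percent.values()) / len(dice_percent)
print(f"Mean Dice over {len(dice_percent)} datasets: {mean_dice:.2f}%")
```

Under that assumption the mean comes out at roughly 88.0, in line with the reported average up to rounding of the per-dataset values.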