OmniFM: Toward Modality-Robust and Task-Agnostic Federated Learning for Heterogeneous Medical Imaging
About
Federated learning (FL) has become a promising paradigm for collaborative medical image analysis, yet existing frameworks remain tightly coupled to task-specific backbones and are fragile under heterogeneous imaging modalities. These constraints hinder real-world deployment, where institutions vary widely in modality distributions and must support diverse downstream tasks. To address these limitations, we propose OmniFM, a modality- and task-agnostic FL framework that unifies training across classification, segmentation, super-resolution, visual question answering, and multimodal fusion without re-engineering the optimization pipeline. OmniFM builds on a key frequency-domain insight: low-frequency spectral components exhibit strong cross-modality consistency and encode modality-invariant anatomical structures. Accordingly, OmniFM integrates (i) Global Spectral Knowledge Retrieval to inject global frequency priors, (ii) Embedding-wise Cross-Attention Fusion to align representations, and (iii) Prefix-Suffix Spectral Prompting to jointly condition global and personalized cues, all regularized by a Spectral-Proximal Alignment objective that stabilizes aggregation. Experiments on real-world datasets show that OmniFM consistently surpasses state-of-the-art FL baselines under both intra- and cross-modality heterogeneity, achieving superior results in both fine-tuning and training-from-scratch setups.
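To make the frequency-domain insight concrete, the following is a minimal sketch of isolating the low-frequency component of a 2D image with a centered circular mask in the Fourier domain. It is an illustration only and not OmniFM's implementation: the function name `low_frequency_component` and the `radius` parameter are hypothetical, and the abstract does not specify how low frequencies are selected.

```python
import numpy as np


def low_frequency_component(image, radius):
    """Return the part of a 2D image carried by low spatial frequencies.

    Hypothetical helper for illustration; `radius` (in frequency bins)
    is an assumed cutoff, not a value taken from the paper.
    """
    # Move the zero-frequency (DC) term to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Circular mask that keeps bins within `radius` of the DC term.
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2

    # Undo the shift and transform back to the image domain.
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real


# Usage: a random array standing in for a single-channel scan.
rng = np.random.default_rng(0)
scan = rng.random((64, 64))
coarse = low_frequency_component(scan, radius=8)
```

With `radius=0` only the DC term survives, so the output is the image mean everywhere; larger radii progressively restore finer detail. Under the abstract's claim, it is this coarse component that would be expected to agree most across modalities.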
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Medical Visual Question Answering | Federated Medical VQA Mixed-Modality Task 3 VQA-Med 2019-2021 SLAKE VQA-RAD (test) | SLAKE Score | 82.33 | 34 |
| Super-Resolution | BreaKHis Scenario 1, x2 | PSNR | 42.93 | 10 |
| Medical Visual Question Answering | Modality-Heterogeneous Federated Medical VQA Task 2 (eight modality-specific clients) | Client 1 Performance | 90.02 | 6 |
| Multi-Modal Image Fusion | CT-MRI | VIF | 0.217 | 6 |
| Multi-Modal Image Fusion | PET-MRI | VIF | 0.309 | 6 |
| Multi-Modal Image Fusion | SPECT-MRI | VIF | 28 | 6 |
| Multi-Modal Image Fusion | Average Heterogeneous Multi-modal Fusion | VIF | 26.9 | 6 |
| Segmentation | FeTS Group 1 2022 (9 Clients) | Dice Score | 79.84 | 5 |
| Segmentation | FeTS Group 2 (6 Clients) 2022 | Dice Coefficient | 80.82 | 5 |
| Segmentation | FeTS2022 Group 3 (5 Clients) | Dice Score | 78.62 | 5 |