Learning Emergent Modular Representations in Multi-modality Medical Vision Foundation Models

About

Multi-modality medical vision (MV) foundation models (FMs) are fundamentally challenged by pronounced non-IID feature statistics across heterogeneous imaging modalities. Monolithic self-supervised optimization on such data induces conflicting gradients, driving representations to collapse toward modality-dominant shortcuts. This work reframes the failure as an imbalance between specialization and coordination in emergent modularity, and proposes Director-Experts (DEX), a modular network that explicitly regulates these dynamics in stacked modules. Each DEX module comprises two parts. A pool of experts, dynamically adapted by our image-wise activation strategy, autonomously specializes in modality-dominant statistics. A director, updated via our group exponential moving average, distills multi-expert knowledge into a shared space for semantic integration across modalities. Together they drive the emergence of modular representations. We curate a new benchmark, Medical Vision Universe, comprising over 4 million images across 10 modalities, which provides DEX with FM-level pre-training over the broadest coverage of distinct imaging modalities. Extensive evaluations on 26 downstream tasks demonstrate improved optimization behavior and transferability, indicating that DEX is a principled step toward general-purpose multi-modality medical AI. Our code and dataset will be released at https://github.com/YutingHe-list/DEX.

Yuting He, Chenyu You, Shuo Li • 2026
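
The two mechanisms named in the abstract, image-wise expert activation and a group exponential moving average for the director, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no equations, so the gating rule (a linear gate followed by softmax and top-k selection, applied once per image), the EMA target (the element-wise mean of the expert parameters), and every name below (`route_image`, `group_ema_update`, `gate_weights`, `momentum`, `top_k`) are assumptions chosen only to make the idea concrete.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_image(image_feature, gate_weights, top_k=1):
    """Pick experts once per image (assumed form of image-wise activation).

    image_feature: one pooled feature vector for the whole image.
    gate_weights:  one weight row per expert (hypothetical linear gate).
    Returns the indices of the top_k experts and all gate probabilities.
    """
    scores = [
        sum(w * x for w, x in zip(row, image_feature)) for row in gate_weights
    ]
    probs = softmax(scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:top_k], probs


def group_ema_update(director, experts, momentum=0.99):
    """Move the director toward the mean of the expert parameters.

    Assumed form of a group EMA:
        director <- momentum * director + (1 - momentum) * mean(experts)
    """
    count = len(experts)
    group_mean = [
        sum(expert[j] for expert in experts) / count
        for j in range(len(director))
    ]
    return [
        momentum * d + (1.0 - momentum) * m
        for d, m in zip(director, group_mean)
    ]


# Toy usage: three experts with two parameters each, one image feature.
experts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
chosen, probs = route_image([2.0, 0.5], gate_weights, top_k=1)
director = group_ema_update([0.0, 0.0], experts, momentum=0.9)
```

In this toy setup the routing decision is made per image rather than per token or patch, which is one plausible reading of "image-wise"; the released code at the URL above should be treated as the reference once available.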

Related benchmarks

Task                     Dataset                       Metric                    Result   Rank
Medical Image Analysis   Fundus (2 tasks)              Average Performance (%)   75.7     13
Medical Image Analysis   Path (4 tasks)                Average Performance       69.9     13
Medical Image Analysis   X-ray (6 tasks)               Average Performance       85.9     13
Medical Image Analysis   US (3 tasks)                  Average Performance (%)   82.5     13
Medical Image Analysis   26 Medical Downstream Tasks   Average Performance       78.4     13
Medical Image Analysis   CT (2 tasks)                  Average Performance       83.9     13
Medical Image Analysis   Endo                          Average Performance       65       13
Medical Image Analysis   MR (2 tasks)                  Average Performance       72.9     13
Medical Image Analysis   OCT (2 tasks)                 Average Performance       87.5     13
Medical Image Analysis   Photo (2 tasks)               Average Performance       87.1     13
Showing 10 of 11 rows
