Do Multiple Instance Learning Models Transfer?
About
Multiple Instance Learning (MIL) is a cornerstone approach in computational pathology (CPath) for generating clinically meaningful slide-level embeddings from gigapixel tissue images. However, MIL often struggles with small, weakly supervised clinical datasets. In contrast to fields such as NLP and conventional computer vision, where transfer learning is widely used to address data scarcity, the transferability of MIL models remains poorly understood. In this study, we systematically evaluate the transfer learning capabilities of pretrained MIL models by assessing 11 models across 21 pretraining tasks for morphological and molecular subtype prediction. Our results show that pretrained MIL models, even when pretrained on organs other than that of the target task, consistently outperform models trained from scratch. Moreover, pretraining on pan-cancer datasets enables strong generalization across organs and tasks, outperforming slide foundation models while using substantially less pretraining data. These findings highlight the robust adaptability of MIL models and demonstrate the benefits of leveraging transfer learning to boost performance in CPath. Lastly, we provide a resource that standardizes the implementation of MIL models and collects pretrained model weights for popular CPath tasks, available at https://github.com/mahmoodlab/MIL-Lab
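To make the setup concrete, the following is a minimal sketch of what "transferring" a MIL model can look like. It assumes a generic attention-pooling aggregator (in the style of attention-based MIL) over patch embeddings; the abstract does not say which of the 11 evaluated models work this way. All names here (`attention_pool`, `init_model`, `transfer`, the dictionary layout, the dimensions) are hypothetical illustrations and are not the MIL-Lab API. The idea shown is only this: keep the pretrained aggregator weights, and re-initialize the classification head for the new target task.

```python
import math
import random


def attention_pool(patches, V, w):
    """Pool a bag of patch embeddings into one slide-level embedding.

    Each patch h gets a score w . tanh(V h); scores are softmax-normalized
    and used as weights in a weighted sum of the patch embeddings.
    """
    scores = []
    for h in patches:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(patches[0])
    slide = [sum(a * h[d] for a, h in zip(attn, patches)) for d in range(dim)]
    return slide, attn


def init_model(dim, hidden, n_classes, rng):
    """Randomly initialize a toy MIL model (hypothetical layout)."""
    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "V": matrix(hidden, dim),                          # attention projection
        "w": [rng.gauss(0.0, 0.1) for _ in range(hidden)], # attention scorer
        "head": matrix(n_classes, dim),                    # linear classifier
    }


def transfer(pretrained, n_target_classes, rng):
    """Copy the pretrained aggregator; start a fresh head for the target task."""
    dim = len(pretrained["V"][0])
    return {
        "V": [row[:] for row in pretrained["V"]],
        "w": pretrained["w"][:],
        "head": [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                 for _ in range(n_target_classes)],
    }


def predict_logits(model, patches):
    """Slide-level logits: attention pooling followed by the linear head."""
    slide, _ = attention_pool(patches, model["V"], model["w"])
    return [sum(a * b for a, b in zip(row, slide)) for row in model["head"]]


rng = random.Random(0)
# Stand-in for a model pretrained on a 5-class source task (weights are random here).
pretrained = init_model(dim=8, hidden=4, n_classes=5, rng=rng)
# Target task with 2 classes: reuse the aggregator, replace the head.
target = transfer(pretrained, n_target_classes=2, rng=rng)
# A synthetic bag of 16 patch embeddings standing in for one slide.
bag = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]
slide_embedding, attn = attention_pool(bag, target["V"], target["w"])
logits = predict_logits(target, bag)
```

In practice the transferred model would then be fine-tuned on the target dataset; the sketch stops at initialization because that is the step that distinguishes transfer from training from scratch.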
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Molecular Classification | 33-task benchmark Molecular Classification | Average Accuracy | 75.6 | 24 |
| Morphological Classification | 33-task benchmark Morphological Classification | Average Accuracy | 86.7 | 24 |
| Morphological Classification | Morphological Classification | Average AUC | 88.1 | 24 |
| Molecular Classification | Molecular Classification | Average AUC | 82.2 | 24 |
| Classification | Task 13 | Accuracy | 91.9 | 16 |
| Molecular Classification | Task 15 | Accuracy (ACC) | 59.8 | 16 |
| Classification | Task 2 | Accuracy | 74.6 | 16 |
| Classification | Task 5 | Accuracy | 98.1 | 16 |
| Classification | Task 8 | Accuracy | 75.3 | 16 |
| Classification | Task 10 | Accuracy | 79.7 | 16 |