MIL-PF: Multiple Instance Learning on Precomputed Features for Mammography Classification
About
Modern foundation models provide highly expressive visual representations, yet adapting them to high-resolution medical imaging remains challenging due to limited annotations and weak supervision. Mammography, in particular, is characterized by large images, variable multi-view studies and predominantly breast-level labels, making end-to-end fine-tuning computationally expensive and often impractical. We propose Multiple Instance Learning on Precomputed Features (MIL-PF), a scalable framework that combines frozen foundation encoders with a lightweight MIL head for mammography classification. By precomputing the semantic representations and training only a small task-specific aggregation module (40k parameters), the method enables efficient experimentation and adaptation without retraining large backbones. The architecture explicitly models the global tissue context and the sparse local lesion signals through attention-based aggregation. MIL-PF achieves state-of-the-art classification performance at clinical scale while substantially reducing training complexity. We release the code for full reproducibility.
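To make the idea of attention-based aggregation over precomputed features concrete, here is a minimal NumPy sketch of a generic attention MIL pooling step. It is not the MIL-PF head itself, whose exact design is not given in this abstract. The function name `attention_mil_pool`, the parameters `V` and `w`, the dimensions, and the random arrays standing in for frozen-encoder embeddings are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def attention_mil_pool(features, V, w):
    """Pool a bag of instance embeddings into one bag embedding.

    features: (n_instances, d) precomputed embeddings (hypothetical stand-in
              for the outputs of a frozen encoder)
    V:        (d, h) projection used to score instances (illustrative)
    w:        (h,)   scoring vector (illustrative)
    """
    scores = np.tanh(features @ V) @ w   # one scalar score per instance
    weights = softmax(scores)            # attention weights, sum to 1
    bag = weights @ features             # weighted average, shape (d,)
    return bag, weights


rng = np.random.default_rng(0)
d, h = 16, 8                             # assumed sizes, not from the paper
V = rng.normal(scale=0.1, size=(d, h))
w = rng.normal(scale=0.1, size=h)

# Two studies with different numbers of instances (e.g. views or patches).
study_a = rng.normal(size=(4, d))
study_b = rng.normal(size=(7, d))

bag_a, weights_a = attention_mil_pool(study_a, V, w)
bag_b, weights_b = attention_mil_pool(study_b, V, w)
```

Because the pooling averages over however many instances a bag contains, studies with different numbers of views map to bag embeddings of the same size, and only the small set of aggregation parameters (here `V` and `w`) would need training while the encoder stays frozen.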
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| BI-RADS-based malignancy classification | EMBED (test) | AUC 91.8 | 13 |
| Calcification Classification | VinDr Calcification | AUC 96.7 | 13 |
| Cancer Classification | RSNA Cancer | AUC 92.5 | 13 |
| Mass Classification | VinDr Mass | AUC 0.814 | 13 |
| BI-RADS-based malignancy classification | VinDr (test) | AUC 91.1 | 13 |
| Object Detection | VinDr Calcification | mAP 38.4 | 4 |
| Object Detection | VinDr Mass | mAP 16.3 | 4 |