MacNet: An End-to-End Manifold-Constrained Adaptive Clustering Network for Interpretable Whole Slide Image Classification
About
Whole slide images (WSIs) are the gold standard for pathological diagnosis and sub-typing. Current mainstream two-step frameworks rely on offline feature encoders trained without domain-specific knowledge. Within these frameworks, attention-based multiple instance learning (MIL) methods are outcome-oriented and offer limited interpretability. Clustering-based approaches can provide an explainable decision-making process but suffer from high-dimensional features and semantically ambiguous centroids. To this end, we propose an end-to-end MIL framework that integrates Grassmann re-embedding and manifold adaptive clustering, where the geometric structure of the manifold facilitates robust clustering results. Furthermore, we design a prior-knowledge-guided proxy instance labeling and aggregation strategy to approximate patch labels and focus on pathologically relevant tumor regions. Experiments on multicentre WSI datasets demonstrate that: 1) our cluster-incorporated model achieves superior performance in both grading accuracy and interpretability; 2) end-to-end learning yields better feature representations at an acceptable computational cost.
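The abstract does not spell out how the Grassmann re-embedding is computed, so the following is only a generic illustration of the underlying idea, not the authors' method. It assumes a set of patch features is represented as a point on a Grassmann manifold (an orthonormal basis of its dominant subspace) and that subspaces are compared with the standard projection metric. The function names `grassmann_point` and `projection_distance` and the subspace dimension `k` are hypothetical.

```python
import numpy as np


def grassmann_point(features: np.ndarray, k: int) -> np.ndarray:
    """Return a (d x k) orthonormal basis for the dominant k-dim subspace
    of an (n x d) feature matrix. This is an assumed re-embedding step."""
    # Right singular vectors span the dominant directions in feature space.
    _, _, vt = np.linalg.svd(features, full_matrices=False)
    return vt[:k].T


def projection_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Projection metric between two subspaces with orthonormal bases a, b:
    ||a a^T - b b^T||_F / sqrt(2) = sqrt(k - ||a^T b||_F^2)."""
    k = a.shape[1]
    overlap = np.linalg.norm(a.T @ b, ord="fro") ** 2
    # Clamp at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(k - overlap, 0.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bag_a = rng.normal(size=(50, 16))  # 50 synthetic patch features, 16-dim
    bag_b = rng.normal(size=(50, 16))
    pa, pb = grassmann_point(bag_a, 4), grassmann_point(bag_b, 4)
    print(projection_distance(pa, pb))
```

Because the distance depends only on the subspace and not on the particular basis, a clustering step built on it is invariant to rotations of the basis vectors, which is one reason manifold-based distances can be more stable than Euclidean distances on raw high-dimensional features.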
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Cancer diagnosis | CAMELYON-16 | Accuracy: 91.43 | 42 |
| Cancer sub-typing | AMU-CSCC | ACC: 92.45 | 40 |
| Cancer sub-typing | DHMC-LUNG | Accuracy: 0.8222 | 40 |
| Cancer sub-typing | AMU-LSCC | ACC: 90.58 | 40 |