PixelCAM: Pixel Class Activation Mapping for Histology Image Classification and ROI Localization
About
Weakly supervised object localization (WSOL) methods allow training models to classify images and localize regions of interest (ROIs). WSOL only requires low-cost image-class annotations yet provides a visually interpretable classifier. Standard WSOL methods rely on class activation mapping (CAM) methods to produce spatial localization maps according to a single- or two-step strategy. While both strategies have made significant progress, they still face several limitations with histology images. Single-step methods can easily result in under- or over-activation due to the limited visual ROI saliency in histology images and scarce localization cues. They also face the well-known issue of asynchronous convergence between classification and localization tasks. The two-step approach is sub-optimal because it is constrained to a frozen classifier, limiting the capacity for localization. Moreover, these methods struggle when applied to out-of-distribution (OOD) datasets. In this paper, a multi-task approach for WSOL is introduced for simultaneous training of both tasks to address the asynchronous convergence problem. In particular, localization is performed in the pixel-feature space of an image encoder that is shared with classification. This allows learning discriminant features and accurate delineation of foreground/background regions to support ROI localization and image classification. We propose PixelCAM, a cost-effective foreground/background pixel-wise classifier in the pixel-feature space that allows for spatial object localization. PixelCAM is trained with a partial cross-entropy loss on pixel pseudo-labels collected from a pretrained WSOL model. Both image and pixel-wise classifiers are trained simultaneously using standard gradient descent. In addition, our pixel classifier can easily be integrated into CNN- and transformer-based architectures without any modifications.
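The partial cross-entropy idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `IGNORE` marker for pixels without a pseudo-label, the two-logit (background, foreground) layout, and the `weight` used to combine the image-level and pixel-level losses are all assumptions made for illustration. The point is only that the pixel loss is averaged over the pixels that carry a pseudo-label, while unlabeled pixels contribute nothing.

```python
import math

IGNORE = -1  # hypothetical marker for pixels that have no pseudo-label


def log_softmax(logits):
    """Numerically stable log-softmax over a short list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def partial_cross_entropy(pixel_logits, pseudo_labels, ignore=IGNORE):
    """Mean cross-entropy over the pseudo-labeled pixels only.

    pixel_logits:  one [background_logit, foreground_logit] pair per pixel
    pseudo_labels: 0 (background), 1 (foreground), or `ignore` per pixel
    """
    total, count = 0.0, 0
    for logits, label in zip(pixel_logits, pseudo_labels):
        if label == ignore:
            continue  # unlabeled pixels do not contribute to the loss
        total -= log_softmax(logits)[label]
        count += 1
    return total / count if count else 0.0


def total_loss(image_ce, pixel_logits, pseudo_labels, weight=1.0):
    """Assumed multi-task objective: image-level loss plus weighted pixel loss."""
    return image_ce + weight * partial_cross_entropy(pixel_logits, pseudo_labels)
```

In a real training loop both terms would be computed on tensors from the shared encoder and minimized jointly by gradient descent; the pure-Python version here only shows how unlabeled pixels are masked out of the average.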
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Localization | CAMELYON17 Center 1 | PxAP | 52.2 | 24 |
| Localization | CAMELYON17 Center 2 | PxAP | 54.9 | 24 |
| Localization | CAMELYON17 Center 0 | PxAP | 49.8 | 24 |
| Localization | GlaS (test) | PxAP | 86.6 | 12 |
| Classification | GlaS (test) | Accuracy | 100 | 11 |
| Object Localization | GlaS center-wise (test) | PxAP | 86.6 | 6 |
| Object Localization | CAMELYON17 center-wise Center 3 (test) | PxAP | 71.9 | 6 |
| Object Localization | CAMELYON17 center-wise Center 4 (test) | PxAP | 50.6 | 6 |
| Image Classification | CAMELYON17 center-wise Center 0 (test) | CL | 80.9 | 6 |
| Image Classification | CAMELYON17 center-wise Center 1 (test) | CL | 73.8 | 6 |