Normal Guidance is what Attention Needs
About
We consider training classifiers for 3D medical images using only one binary label for the entire volume rather than a label for each 2D slice. In such weakly supervised settings, can we learn accurate classifiers for slice-level predictions? Attention-based multiple instance learning (MIL) can produce an attention score for every slice. Yet recent work demonstrates that a simple center-focused baseline that ignores image content can outperform attention-based and transformer-based MIL at slice-level classification of 3D brain scans. We show that this baseline also outperforms existing MIL methods at slice-level classification of thoracic and abdominal CT scans. Motivated by this baseline, we propose Normal Guidance, a regularization technique that encourages the learned attention distribution to follow a bell-shaped curve. Across three medical imaging datasets totaling over 4 million 2D slices, we show that Normal Guidance enables attention-based and transformer-based MIL methods to deliver significantly better slice-level localization than state-of-the-art methods while remaining competitive at whole-scan classification.
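To make the idea of a bell-shaped attention regularizer concrete, here is a minimal sketch in NumPy. It is not the authors' implementation: the abstract does not state the exact loss, so the KL-divergence penalty, the function names (`gaussian_prior`, `normal_guidance_penalty`), and the parameters `mean_frac`, `std_frac`, and `lam` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_prior(num_slices, mean_frac=0.5, std_frac=0.2):
    """Discretized bell curve over slice indices, normalized to sum to 1.

    mean_frac and std_frac are hypothetical knobs: the peak position and
    width are expressed as fractions of the scan depth.
    """
    idx = np.arange(num_slices, dtype=np.float64)
    mean = mean_frac * (num_slices - 1)
    std = std_frac * num_slices
    logits = -0.5 * ((idx - mean) / std) ** 2
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()


def softmax(x):
    """Numerically stable softmax over a 1D array of attention scores."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


def normal_guidance_penalty(attention, prior, eps=1e-12):
    """KL(prior || attention): an assumed choice of divergence.

    It is near zero when the attention weights match the bell curve and
    grows as attention mass moves away from it.
    """
    attention = np.clip(attention, eps, None)
    return float(np.sum(prior * (np.log(prior + eps) - np.log(attention))))


# Illustrative use: one scan with 41 slices and random attention scores.
rng = np.random.default_rng(0)
scores = rng.normal(size=41)        # stand-in for a MIL model's raw scores
attention = softmax(scores)         # one attention weight per slice
prior = gaussian_prior(41)

lam = 0.1                           # hypothetical regularization strength
bag_loss = 0.0                      # placeholder for the scan-level loss
total_loss = bag_loss + lam * normal_guidance_penalty(attention, prior)
```

In a real training loop the penalty would be written in a differentiable framework and added to the whole-scan classification loss; the weighting and the shape of the prior would need to be tuned per dataset.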
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Localization | Chest CT (test) | AUPRC | 0.535 | 10 |
| Localization | Semi-Synthetic (test) | AUPRC | 57.8 | 10 |
| Localization | Head CT (test) | AUPRC | 74.4 | 10 |
| Localization | Abdomen CT (test) | AUPRC | 24.8 | 10 |
| Whole-scan classification | Chest CT (test) | AUPRC | 47.7 | 9 |
| Bag-level whole-scan classification | Chest CT (test) | AUROC | 67.8 | 9 |
| Bag-level whole-scan classification | Abdomen CT (test) | AUROC | 0.684 | 9 |
| Bag-level whole-scan classification | Head CT (test) | AUROC | 92.6 | 9 |
| Whole-scan classification | Head CT (test) | AUPRC | 91.2 | 9 |
| Whole-scan classification | Abdomen CT (test) | AUPRC | 13.9 | 9 |