Normal Guidance is what Attention Needs

About

We consider training classifiers for 3D medical images using only one binary label for the entire volume rather than a label for each 2D slice. In such weakly supervised settings, can we learn accurate classifiers for slice-level predictions? Attention-based multiple instance learning (MIL) can produce an attention score for every slice. Yet recent work demonstrates that a simple center-focused baseline that ignores image content can outperform attention-based and transformer-based MIL at slice-level classification of 3D brain scans. We show that this baseline also outperforms existing MIL methods at slice-level classification of thoracic and abdominal CT scans. Motivated by this baseline, we propose Normal Guidance, a regularization technique that encourages the learned attention distribution to follow a bell-shaped curve. Across three medical imaging datasets totaling over 4 million 2D slices, we show that Normal Guidance enables attention-based and transformer-based MIL methods to deliver significantly better slice-level localization than the state of the art while remaining competitive at whole-scan classification.
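To make the idea of guiding attention toward a bell-shaped curve concrete, here is a minimal NumPy sketch. It is an illustration only, not the authors' formulation: the abstract does not specify the penalty, so the KL-divergence form, the function names (`gaussian_prior`, `normal_guidance_penalty`, `guided_loss`), and the `center`, `width`, and `strength` parameters are all assumptions.

```python
import numpy as np


def gaussian_prior(num_slices, center=0.5, width=0.2):
    """Discretized bell curve over slice positions, normalized to sum to 1.

    `center` and `width` are hypothetical parameters, expressed as
    fractions of the scan depth.
    """
    positions = (np.arange(num_slices) + 0.5) / num_slices  # values in (0, 1)
    density = np.exp(-0.5 * ((positions - center) / width) ** 2)
    return density / density.sum()


def softmax(scores):
    """Turn raw per-slice attention scores into a probability distribution."""
    shifted = scores - scores.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def normal_guidance_penalty(attention, prior, eps=1e-12):
    """KL(prior || attention); zero when attention equals the bell curve.

    The choice of KL divergence (and its direction) is an assumption.
    """
    return float(np.sum(prior * (np.log(prior + eps) - np.log(attention + eps))))


def guided_loss(bag_loss, attention, prior, strength=0.1):
    """Whole-scan loss plus the weighted penalty (`strength` is hypothetical)."""
    return bag_loss + strength * normal_guidance_penalty(attention, prior)


if __name__ == "__main__":
    prior = gaussian_prior(num_slices=32)
    attention = softmax(np.zeros(32))  # uniform attention over 32 slices
    print(guided_loss(bag_loss=0.7, attention=attention, prior=prior))
```

In this sketch the penalty is added to the usual whole-scan loss, so the single binary label still drives classification while the extra term only shapes how attention is spread across slices.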

Ethan Harvey, Dennis Johan Loevlie, Michael C. Hughes • 2026

Related benchmarks

Task	Dataset	Metric	Value
Localization	Chest CT (test)	AUPRC	0.535	10
Localization	Semi-Synthetic (test)	AUPRC	57.8	10
Localization	Head CT (test)	AUPRC	74.4	10
Localization	Abdomen CT (test)	AUPRC	24.8	10
Whole-scan classification	Chest CT (test)	AUPRC	47.7	9
Bag-level whole-scan classification	Chest CT (test)	AUROC	67.8	9
Bag-level whole-scan classification	Abdomen CT (test)	AUROC	0.684	9
Bag-level whole-scan classification	Head CT (test)	AUROC	92.6	9
Whole-scan classification	Head CT (test)	AUPRC	91.2	9
Whole-scan classification	Abdomen CT (test)	AUPRC	13.9	9

Showing 10 of 12 rows
