
Deep Probabilistic Supervision for Image Classification

About

Supervised training of deep neural networks for classification typically relies on hard targets, which promote overconfidence and can limit calibration, generalization, and robustness. Self-distillation methods aim to mitigate this by leveraging the inter-class and sample-specific information present in the model's own predictions, but they often remain dependent on hard targets and do not explicitly model predictive uncertainty. With this in mind, we propose Deep Probabilistic Supervision (DPS), a principled learning framework that constructs sample-specific target distributions via statistical inference on the model's own predictions, remaining independent of hard targets after initialization. We show that DPS consistently yields higher test accuracy (e.g., +2.0% for DenseNet-264 on ImageNet) and significantly lower Expected Calibration Error (ECE) (e.g., -40% for ResNet-50 on CIFAR-100) than existing self-distillation methods. When combined with a contrastive loss, DPS achieves state-of-the-art robustness under label noise.
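The abstract does not spell out how DPS builds its targets, so the following is only a rough illustration of the self-distillation family it improves on: a heuristic blend of one-hot labels with the model's own temperature-smoothed predictions, trained with cross-entropy against those soft targets. All function names and parameters here are hypothetical; DPS itself replaces this hard-target mix with statistically inferred target distributions that are label-free after initialization.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with optional temperature smoothing."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_distillation_targets(logits, labels, num_classes, alpha=0.9, temperature=2.0):
    """Generic self-distillation targets: a convex blend of the one-hot
    label and the model's own softened prediction for each sample.
    (Illustrative only; not the DPS construction.)"""
    one_hot = np.eye(num_classes)[labels]
    soft = softmax(logits, temperature)
    return alpha * one_hot + (1.0 - alpha) * soft

def soft_cross_entropy(targets, logits):
    """Cross-entropy of the model's predictions against soft targets."""
    log_probs = np.log(softmax(logits) + 1e-12)
    return -np.mean(np.sum(targets * log_probs, axis=-1))

# One training sample, three classes, ground-truth class 0.
logits = np.array([[2.0, 0.5, -1.0]])
targets = self_distillation_targets(logits, np.array([0]), num_classes=3)
loss = soft_cross_entropy(targets, logits)
```

Each target row remains a valid probability distribution (it sums to 1), so the same cross-entropy machinery used with hard labels applies unchanged.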

Anton Adelöw, Matteo Gamba, Atsuto Maki · 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Image Classification | CIFAR-100 (test) | Accuracy | 89.54 | 3518 |
| Image Classification | CIFAR-10 (test) | Accuracy | 98.44 | 3381 |
| Image Classification | TinyImageNet (test) | Accuracy | 89.65 | 440 |
| Image Classification | ImageNet (test) | Top-1 Accuracy | 79.88 | 299 |
| Calibration | CIFAR-100 (test) | ECE | 0.85 | 104 |
| Out-of-Distribution Detection | CIFAR-10 (ID) vs SVHN (OOD) (test) | AUROC | 98.03 | 88 |
| Out-of-Distribution Detection | CIFAR-100 (ID) vs SVHN (OOD) (test) | AUROC | 90.71 | 67 |
| Image Classification | CIFAR-10-C (test) | Accuracy (Clean) | 91.57 | 61 |
| Image Classification | CIFAR-10, 40% asymmetric noise (test) | Final Accuracy | 95.6 | 42 |
| Image Classification | CIFAR-10, 50% symmetric noise (test) | Accuracy (Test) | 0.962 | 36 |
Showing 10 of 15 rows
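The Expected Calibration Error reported in the Calibration row above is a standard binned metric: predictions are grouped by confidence, and ECE is the sample-weighted average gap between each bin's mean confidence and its accuracy. A minimal NumPy sketch of that standard definition (bin count and names are my own choices, not taken from the paper):

```python
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=15):
    """Binned ECE: sum over bins of (bin weight) * |accuracy - confidence|."""
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(labels)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            bin_accuracy = np.mean(predictions[in_bin] == labels[in_bin])
            bin_confidence = np.mean(confidences[in_bin])
            ece += (in_bin.sum() / n) * abs(bin_accuracy - bin_confidence)
    return ece

# A model that is always correct but only 90% confident has ECE = 0.10.
conf = np.full(100, 0.9)
pred = np.zeros(100, dtype=int)
true = np.zeros(100, dtype=int)
gap = expected_calibration_error(conf, pred, true)
```

Lower is better: a perfectly calibrated model has confidence equal to accuracy in every bin, giving an ECE of zero.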
