# Using Pre-Training Can Improve Model Robustness and Uncertainty

## About
He et al. (2018) have called into question the utility of pre-training by showing that training from scratch can often yield similar performance to pre-training. We show that although pre-training may not improve performance on traditional classification metrics, it improves model robustness and uncertainty estimates. Through extensive experiments on adversarial examples, label corruption, class imbalance, out-of-distribution detection, and confidence calibration, we demonstrate large gains from pre-training and complementary effects with task-specific methods. We introduce adversarial pre-training and show approximately a 10% absolute improvement over the previous state-of-the-art in adversarial robustness. In some cases, using pre-training without task-specific methods also surpasses the state-of-the-art, highlighting the need for pre-training when evaluating future methods on robustness and uncertainty tasks.
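The adversarial robustness results above rest on adversarial training: perturbing each input in the direction that most increases the loss, then training on the perturbed input. The paper's adversarial pre-training uses multi-step PGD on deep networks; as a minimal sketch of the core perturbation step only, here is single-step FGSM on a toy two-feature logistic model (all names and numbers below are illustrative, not from the paper):

```python
import math

def logistic_loss(w, b, x, y):
    # y in {-1, +1}; loss = log(1 + exp(-y * (w.x + b)))
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return math.log1p(math.exp(-y * z))

def fgsm(w, b, x, y, eps):
    # Gradient of the loss w.r.t. the input x is -y * sigmoid(-y*z) * w.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    s = 1.0 / (1.0 + math.exp(y * z))  # sigmoid(-y * z)
    grad = [-y * s * wi for wi in w]
    # Step in the sign of the gradient to maximize the loss (FGSM).
    return [xi + eps * (1 if g > 0 else -1 if g < 0 else 0)
            for xi, g in zip(x, grad)]

w, b = [1.0, -2.0], 0.1
x, y = [0.5, 0.2], 1
x_adv = fgsm(w, b, x, y, eps=0.1)
# The perturbed input incurs a strictly higher loss than the clean one.
print(logistic_loss(w, b, x_adv, y) > logistic_loss(w, b, x, y))  # True
```

Adversarial training would then minimize the loss at `x_adv` rather than `x`; the paper's contribution is showing that doing this during pre-training, before task-specific fine-tuning, yields the robustness gains reported above.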
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-100 (test) | -- | -- | 3518 |
| Image Classification | TinyImageNet (test) | Accuracy | 63.8 | 366 |
| Image Classification | ImageNet-A (test) | Top-1 Accuracy | 11.4 | 154 |
| Image Classification | ImageNet-C (test) | -- | -- | 110 |
| Image Classification | ImageNet-R (test) | -- | -- | 105 |
| Image Classification | CIFAR-10-C Severity Level 5 (test) | -- | -- | 62 |
| Image Classification | ImageNet-C Severity Level 5 | -- | -- | 61 |
| Image Classification | ImageNet-200 (test) | Top-1 Error Rate | 7 | 28 |
| Image Classification | CIFAR-10 (test) | Clean Accuracy | 87.1 | 16 |
| Image Classification | CIFAR-10 (test) | AutoAttack Accuracy | 54.92 | 14 |