Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples
About
Self-paced learning and hard example mining re-weight training instances to improve learning accuracy. This paper presents two improved alternatives based on lightweight estimates of sample uncertainty in stochastic gradient descent (SGD): the variance in predicted probability of the correct class across iterations of mini-batch SGD, and the proximity of the correct class probability to the decision threshold. Extensive experimental results on six datasets show that our methods reliably improve accuracy in various network architectures, including additional gains on top of other popular training techniques, such as residual learning, momentum, ADAM, batch normalization, dropout, and distillation.
Haw-Shiuan Chang, Erik Learned-Miller, Andrew McCallum • 2017
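The two uncertainty estimates named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the smoothing constant `eps`, the use of standard deviation and of `p * (1 - p)` as a closeness measure (which assumes a decision threshold of 0.5), and the normalization to mean 1 are all assumptions made for illustration. The input is a hypothetical per-sample history of the predicted probability of the correct class, recorded over earlier SGD iterations.

```python
import numpy as np


def variance_weights(prob_history, eps=0.05):
    """Weight samples by how much p(correct class) fluctuated across iterations.

    prob_history: array of shape (num_samples, num_recorded_iterations).
    eps: assumed smoothing constant so that no sample gets zero weight.
    """
    h = np.asarray(prob_history, dtype=float)
    w = h.std(axis=1) + eps          # high variance -> larger weight
    return w / w.mean()              # normalize so the mean weight is 1


def threshold_closeness_weights(prob_history, eps=0.05):
    """Weight samples by how close the average p(correct class) is to 0.5.

    Uses p * (1 - p), which peaks at p = 0.5 (an assumed threshold).
    """
    h = np.asarray(prob_history, dtype=float)
    p_mean = h.mean(axis=1)
    w = p_mean * (1.0 - p_mean) + eps
    return w / w.mean()


# Hypothetical histories for three training samples.
history = np.array([
    [0.95, 0.96, 0.97, 0.96],  # consistently easy
    [0.30, 0.70, 0.40, 0.80],  # prediction fluctuates
    [0.05, 0.04, 0.06, 0.05],  # consistently wrong
])

w_var = variance_weights(history)
w_thr = threshold_closeness_weights(history)
print("variance-based weights:", w_var)
print("threshold-closeness weights:", w_thr)
```

In a training loop, weights like these would typically multiply the per-sample loss before averaging over the mini-batch. Under both estimates the fluctuating sample receives the largest weight, while the consistently easy and consistently wrong samples are de-emphasized.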
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (test) | -- | 3518 |
| Image Classification | ANIMAL-10N (test) | Accuracy: 80.5 | 123 |
| Multi-Label Classification | Corel5k | Ranking Loss: 0.1554 | 43 |
| Image Classification | CIFAR-10 40% asymmetric noise (test) | Final Accuracy: 78 | 42 |
| Multilabel Classification | mediamill (test) | Macro F1 Score: 13.95 | 39 |
| Multi-Label Classification | MEDIAMILL | Macro-AUC: 86.99 | 32 |
| Multi-Label Classification | RCV subset2 | Ranking Loss: 0.0525 | 32 |
| Multi-Label Classification | Yeast | Macro-AUC: 0.7236 | 32 |
| Multi-Label Classification | CAL500 | Macro-AUC: 58.26 | 32 |
| Multi-Label Classification | RCV subset3 | Macro-AUC: 91.76 | 32 |
Showing 10 of 44 rows