Shake-Shake regularization
About
The method introduced in this paper aims at helping deep learning practitioners faced with an overfit problem. The idea is to replace, in a multi-branch network, the standard summation of parallel branches with a stochastic affine combination. Applied to 3-branch residual networks, shake-shake regularization improves on the best single shot published results on CIFAR-10 and CIFAR-100 by reaching test errors of 2.86% and 15.85%. Experiments on architectures without skip connections or Batch Normalization show encouraging results and open the door to a large set of applications. Code is available at https://github.com/xgastaldi/shake-shake
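The stochastic affine combination described above can be illustrated with a minimal sketch. In a 3-branch residual block, the two residual branches are no longer summed; instead they are blended with a random coefficient drawn per pass. The helper below is a hypothetical illustration (names `shake_shake_forward`, `branch1`, `branch2` are assumptions, not from the paper's code), and it only shows the forward "shake"; the full method in the paper also re-draws an independent coefficient for the backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def shake_shake_forward(x, branch1, branch2, training=True):
    """Shake-shake residual block (forward pass only, illustrative).

    Training: x + alpha*branch1(x) + (1 - alpha)*branch2(x), alpha ~ U(0, 1).
    Inference: alpha is fixed at its expectation 0.5, so the block reduces
    to the usual averaged two-branch residual sum.
    """
    alpha = rng.uniform() if training else 0.5
    return x + alpha * branch1(x) + (1.0 - alpha) * branch2(x)

# Toy usage with simple linear "branches" standing in for conv stacks.
x = np.array([1.0, 2.0, 3.0])
out = shake_shake_forward(x, lambda t: 2.0 * t, lambda t: -t, training=False)
```

At inference the block is deterministic; during training each image (or each mini-batch, depending on the variant) sees a different random blend, which is the source of the regularization effect.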
Xavier Gastaldi • 2017
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (test) | -- | 3518 |
| Image Classification | CIFAR-10 (test) | -- | 3381 |
| Image Classification | SVHN 1000 labels (test) | Error Rate: 12.3 | 69 |
| Image Classification | CIFAR-10 4,000 labels (test) | Test Error Rate: 13.4 | 57 |
| Image Classification | CIFAR-10 | Top-1 Error (%): 0.02 | 32 |
| Image Classification | CIFAR-10 full (test) | Error Rate: 2.6 | 18 |
| Image Classification | SVHN full (test) | Error Rate: 1.2 | 6 |