Shake-Shake regularization
About
The method introduced in this paper aims at helping deep learning practitioners faced with an overfit problem. The idea is to replace, in a multi-branch network, the standard summation of parallel branches with a stochastic affine combination. Applied to 3-branch residual networks, shake-shake regularization improves on the best single shot published results on CIFAR-10 and CIFAR-100 by reaching test errors of 2.86% and 15.85%. Experiments on architectures without skip connections or Batch Normalization show encouraging results and open the door to a large set of applications. Code is available at https://github.com/xgastaldi/shake-shake
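The stochastic affine combination described above can be illustrated with a minimal sketch. In a 3-branch residual block, the two residual branches are no longer summed; instead they are blended with a random coefficient drawn per pass. The helper below is a hypothetical illustration (names `shake_shake_forward`, `branch1`, `branch2` are assumptions, not from the paper's code), and it only shows the forward "shake"; the full method in the paper also re-draws an independent coefficient for the backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def shake_shake_forward(x, branch1, branch2, training=True):
    """Shake-shake residual block (forward pass only, illustrative).

    Training: x + alpha*branch1(x) + (1 - alpha)*branch2(x), alpha ~ U(0, 1).
    Inference: alpha is fixed at its expectation 0.5, so the block reduces
    to the usual averaged two-branch residual sum.
    """
    alpha = rng.uniform() if training else 0.5
    return x + alpha * branch1(x) + (1.0 - alpha) * branch2(x)

# Toy usage with simple linear "branches" standing in for conv stacks.
x = np.array([1.0, 2.0, 3.0])
out = shake_shake_forward(x, lambda t: 2.0 * t, lambda t: -t, training=False)
```

At inference the block is deterministic; during training each image (or each mini-batch, depending on the variant) sees a different random blend, which is the source of the regularization effect.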
Xavier Gastaldi • 2017
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (test) | -- | 3518 |
| Image Classification | CIFAR-10 (test) | -- | 3381 |
| Image Classification | SVHN 1000 labels (test) | Error Rate: 12.3 | 69 |
| Image Classification | CIFAR-10 4,000 labels (test) | Test Error Rate: 13.4 | 57 |
| Image Classification | CIFAR-10 | Top-1 Error (%): 0.02 | 32 |
| Image Classification | CIFAR-10 full (test) | Error Rate: 2.6 | 18 |
| Image Classification | SVHN full (test) | Error Rate: 1.2 | 6 |