Improving neural networks by preventing co-adaptation of feature detectors
About
When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random "dropout" gives big improvements on many benchmark tasks and sets new records for speech and object recognition.
Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov • 2012
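The abstract describes the core mechanism: omit each hidden unit with probability 0.5 on every training case, then use all units at test time with their contributions halved. Below is a minimal NumPy sketch of that training-time masking and test-time rescaling; the function name, the example activations, and the use of activation scaling (equivalent to the paper's halving of outgoing weights) are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(activations, drop_prob=0.5, train=True):
    """Apply dropout as described in the paper (hypothetical helper).

    Training: each unit is independently zeroed with probability
    `drop_prob` on every case. Test: all units are kept, but their
    activations are scaled by (1 - drop_prob), which has the same
    effect as halving the outgoing weights when drop_prob = 0.5.
    """
    if not train:
        # Test time: keep every unit, rescale so the expected input to
        # the next layer matches what it saw during training.
        return activations * (1.0 - drop_prob)
    # Training time: sample a fresh binary mask for this training case.
    mask = rng.random(activations.shape) >= drop_prob
    return activations * mask

# Example: one hidden layer's activations for a single training case.
h = np.array([0.8, 1.2, 0.1, 0.5, 0.9, 0.3])
print(dropout_forward(h, train=True))   # roughly half the units zeroed
print(dropout_forward(h, train=False))  # all units kept, scaled by 0.5
```

Because the mask is resampled per training case, a unit cannot rely on any fixed set of co-active partners, which is the co-adaptation the paper aims to prevent.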
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-10 (test) | -- | 3381 |
| Image Classification | Permuted MNIST T=784 (test) | -- | 62 |
| Low-shot Image Classification | ImageNet 1k (novel classes) | -- | 57 |
| Low-shot Image Classification | ImageNet base and novel classes (test val) | Top-5 Acc (n=1): 50.1 | 26 |
| Image Captioning | MS-COCO Karpathy 2014 (test) | BLEU-4: 24.4 | 24 |