
Do Deep Nets Really Need to be Deep?

About

Currently, deep neural networks are the state of the art on problems such as speech recognition and computer vision. In this extended abstract, we show that shallow feed-forward networks can learn the complex functions previously learned by deep nets and achieve accuracies previously only achievable with deep models. Moreover, in some cases the shallow neural nets can learn these deep functions using a total number of parameters similar to the original deep model. We evaluate our method on the TIMIT phoneme recognition task and are able to train shallow fully-connected nets that perform similarly to complex, well-engineered, deep convolutional architectures. Our success in training shallow neural nets to mimic deeper models suggests that there probably exist better algorithms for training shallow feed-forward nets than those currently available.
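The mimic-training idea described above — fitting a shallow student network to reproduce a deeper model's outputs rather than the original labels — can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the "teacher" here is just a fixed random two-layer function standing in for a trained deep net, and all sizes, learning rates, and step counts are arbitrary assumptions. The student is a single-hidden-layer net trained by plain gradient descent on an L2 regression loss against the teacher's unnormalized logits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "teacher": a fixed random function producing logits,
# standing in for a trained deep model.
d_in, d_h_teacher, d_out = 8, 32, 4
W1_t = rng.normal(size=(d_in, d_h_teacher))
W2_t = rng.normal(size=(d_h_teacher, d_out))

def teacher_logits(x):
    return np.tanh(x @ W1_t) @ W2_t

# Shallow student: one wide hidden layer, trained to regress the
# teacher's logits (the "mimic" objective) instead of hard labels.
d_h_student = 64
W1 = rng.normal(scale=0.1, size=(d_in, d_h_student))
W2 = rng.normal(scale=0.1, size=(d_h_student, d_out))

X = rng.normal(size=(256, d_in))
T = teacher_logits(X)  # regression targets: logits, not class labels

mse0 = np.mean((np.tanh(X @ W1) @ W2 - T) ** 2)  # loss before training

lr = 1e-3
for step in range(2000):
    H = np.tanh(X @ W1)
    Z = H @ W2
    G = (Z - T) / len(X)                    # grad of 0.5 * mean sq. error
    gW2 = H.T @ G
    gW1 = X.T @ ((G @ W2.T) * (1 - H ** 2))  # backprop through tanh
    W2 -= lr * gW2
    W1 -= lr * gW1

mse = np.mean((np.tanh(X @ W1) @ W2 - T) ** 2)  # loss after training
```

After training, the student's squared error against the teacher's logits should be well below its starting value, illustrating that a shallow net can approximate the function the deeper model computes on this data.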

Lei Jimmy Ba, Rich Caruana • 2013

Related benchmarks

| Task                 | Dataset            | Result                | Rank |
|----------------------|--------------------|-----------------------|------|
| Image Classification | CIFAR-100 (test)   | Accuracy: 74.08       | 3518 |
| Image Classification | ImageNet-1k (val)  | --                    | 1453 |
| Image Classification | MNIST (test)       | Accuracy: 99.49       | 882  |
| Image Classification | ImageNet-1K        | Top-1 Acc: 56.58      | 836  |
| Image Classification | CIFAR-100          | Top-1 Accuracy: 73.08 | 622  |
| Image Classification | CIFAR-10           | --                    | 507  |
| Image Classification | MNIST              | --                    | 395  |
| Image Classification | TinyImageNet (test)| --                    | 366  |
| Image Classification | Tiny-ImageNet      | Accuracy: 34.2        | 227  |
| Image Classification | Tiny-ImageNet      | Top-1 Accuracy: 60.07 | 143  |

Showing 10 of 18 rows.
