Net2Net: Accelerating Learning via Knowledge Transfer

About

We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains very many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state of the art accuracy rating on the ImageNet dataset.

Tianqi Chen, Ian Goodfellow, Jonathon Shlens • 2015
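
To make the idea of a function-preserving transformation concrete, here is a minimal NumPy sketch on a toy two-layer ReLU network. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`forward`, `widen`, `identity_layer`), the layer sizes, and the random seed are all hypothetical. Widening copies randomly chosen hidden units and divides their outgoing weights by the number of copies; deepening inserts a layer initialized to the identity. In both cases the larger network should compute the same outputs as the smaller one before any further training.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, W1, b1, W2, b2):
    # Toy two-layer network: ReLU hidden layer, linear output.
    return relu(x @ W1 + b1) @ W2 + b2

def widen(W1, b1, W2, new_width, rng):
    # Hypothetical helper: grow the hidden layer to new_width units
    # without changing the function the network computes.
    old_width = W1.shape[1]
    assert new_width >= old_width
    # Keep every original unit, then replicate randomly chosen ones.
    extra = rng.integers(0, old_width, size=new_width - old_width)
    mapping = np.concatenate([np.arange(old_width), extra])
    # counts[j] = how many copies of original unit j now exist.
    counts = np.bincount(mapping, minlength=old_width)
    W1_new = W1[:, mapping]
    b1_new = b1[mapping]
    # Split each unit's outgoing weights evenly among its copies.
    W2_new = W2[mapping, :] / counts[mapping][:, None]
    return W1_new, b1_new, W2_new

def identity_layer(width):
    # Hypothetical helper: weights for an extra layer that starts as the
    # identity. Because ReLU outputs are non-negative, relu(I @ h) == h.
    return np.eye(width), np.zeros(width)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
W1, b1 = rng.normal(size=(4, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 3)), rng.normal(size=3)

# Wider network: 5 hidden units become 9.
W1w, b1w, W2w = widen(W1, b1, W2, new_width=9, rng=rng)
y_small = forward(x, W1, b1, W2, b2)
y_wide = forward(x, W1w, b1w, W2w, b2)

# Deeper network: an identity-initialized layer after the hidden layer.
Wi, bi = identity_layer(5)
h = relu(x @ W1 + b1)
y_deep = relu(h @ Wi + bi) @ W2 + b2

print(np.allclose(y_small, y_wide), np.allclose(y_small, y_deep))
```

The identity trick in this sketch relies on the activation being ReLU; other activations would need a different initialization to preserve the function exactly.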

Related benchmarks

Task	Dataset	Metric	Result	
Image Classification	CIFAR-100 (test)	Accuracy	76.48	3518
Image Classification	CIFAR-10 (test)	Accuracy	91.78	3381
Image Classification	ImageNet-1k 1.0 (test)	Top-1 Accuracy	72.29	251
Image Classification	ImageNet	--	--	184
Continual Learning	CIFAR100 Split	Average Per-Task Accuracy	16.9	117
Image Classification	MNIST (train)	Train Accuracy	98.99	107
Regression	California Housing	MSE	0.391	71
Continual Supervised Learning	CIFAR 5+1	Total Average Online Task Accuracy	31.8	49
Continual Supervised Learning	CIFAR Random Label	Total Average Online Task Accuracy	18	49
Continual Supervised Learning	Continual ImageNet	Total Average Online Task Accuracy	70.7	49

Showing 10 of 19 rows
